hardlimited version. However, by deliberately adding noise to samples of the signal
prior to hardlimiting, it is shown that the signal can be estimated consistently
from its hardlimited noisy samples as the sampling rate tends to infinity. In fact,
such estimates are shown to converge with probability one to the signal and also
to be asymptotically normal. The estimates, which are generally nonlinear, can be
made linear by a proper choice of the noise distribution. These rather unexpected
results hold for all bounded and uniformly continuous signals. In addition to the
hardlimiter, such results are also established for certain monotonic and
nonmonotonic nonlinearities.


The work of E. Masry was supported by the Office of Naval Research under
Contract N00014-75-C-0652. The work of S. Cambanis was supported by the Air Force
Office of Scientific Research under Grant AFOSR-75-2796.

E. Masry is with the Electrical Engineering & Computer Sciences Department,
University of California at San Diego, La Jolla, CA 92093.

S. Cambanis is with the Department of Statistics, University of North
Carolina, Chapel Hill, NC 27514.




I.  INTRODUCTION 

In this paper we study the problem of reconstructing a real signal s(t),
defined on an interval I, from certain nonlinear transformations of its samples
$\{s(k/W)\}_k$ that are deliberately corrupted by additive noise $\{X_k\}_k$, i.e., from
$\{f[s(k/W) + X_k]\}_k$, where f(x) is a memoryless nonlinearity such as a hardlimiter.
Under appropriate conditions it is shown that a properly chosen, generally
nonlinear, estimate $\hat s_W(t)$ of s(t) converges in quadratic mean, as well as with
probability one, to s(t) as the sampling rate W tends to infinity. It should be pointed
out that the memoryless nonlinearity f(x) need not be one-to-one, so that the signal
s(t) cannot, in general, be recovered from $\{f[s(k/W)]\}_k$ as W tends to infinity in
the absence of the additive noise $\{X_k\}$. It is the deliberate addition of the noise
that makes the reconstruction of the signal feasible.

This work is motivated by the observation that an arbitrary continuous
function s(t), $-\infty < t < \infty$, cannot, in general, be reconstructed from its sign, sgn[s(t)],
$-\infty < t < \infty$. This situation remains true even when the function s(t) is analytic, e.g.,
bandlimited. We recall that for a bandlimited function $s(t) = \int_{-W}^{W} e^{it\lambda} S(\lambda)\,d\lambda$,
$S(\lambda) \in L_1[-W,W]$, we have by Titchmarsh's theorem [1] the conditionally convergent
product $s(t) = s(0) \prod_{n=1}^{\infty} (1 - t/z_n)$, where $s(0) \ne 0$ and $\{z_n\}$ is the set of all (real
and complex) zeros of s(z), $z = t + iu$, in the complex plane. Thus s(t) cannot be
determined from its zero crossings since the complex zeros are not observable.

Duffin and Schaeffer [2] have shown that the function $r(z) = C \cos Wz - s(z)$,
$C > \sup_t |s(t)|$, has real simple zeros $\{t_n\}$ only and $r(t) = r(0) \prod_{n=1}^{\infty} (1 - t/t_n)$, so
that

$$s(t) = C \cos Wt - [C - s(0)] \prod_{n=1}^{\infty} \left(1 - \frac{t}{t_n}\right).$$

Hence, s(t) can be determined by the zero crossings of $C \cos Wt - s(t)$. This result
has found no practical use in communication systems since the identification of the
zero crossing points $\{t_n\}$ of $C \cos Wt - s(t)$, as well as the formation of the infinite
product $\prod_n (1 - t/t_n)$, are not easily implemented. More significantly, no digital
reconstruction scheme of s(t), based on samples of $\mathrm{sgn}[C \cos Wt - s(t)]$, is available.

It will be shown in Section II that for all bounded uniformly continuous
signals s(t) (not necessarily bandlimited) we have estimates $\hat s_W(t)$ of s(t), based
on the binary data $\{\mathrm{sgn}[s(k/W) + X_k]\}_k$, which converge with probability one to s(t)
as the sampling rate W tends to infinity. It is the deliberate corruption of the
samples of the signal by the noise, before hardlimiting, that makes it possible to
reconstruct s(t) from the output of the hardlimiter. Moreover, by properly
choosing the distribution of the noise, we can make the estimate linear.

The general problem can be modelled as a transmitter/receiver (with no
channel noise) with a structure depicted in Figure 1. A continuous-time signal s(t)
on an interval I is sampled at equally-spaced points $\{k/W\}_k$ in I, where W is the
sampling rate. The samples $\{s(k/W)\}_k$ are then deliberately corrupted by additive
noise $\{X_k\}_k$, which is a sequence of independent identically distributed random
variables whose distribution is specified below. The noisy samples $\{Y_{W,k} = s(k/W) + X_k\}_k$
are passed through a given memoryless nonlinearity f(x) which need not be monotonic,
a typical example being a hardlimiter. Its output sequence $\{Z_{W,k} = f(Y_{W,k})\}_k$ is
transmitted. The receiver structure is generally nonlinear and consists of a
linear system $h_W = \{h_W(t,k)\}_k$ cascaded with a memoryless nonlinearity g(x). The
output $\hat m_W(t)$ of the linear system $h_W$ is given by

$$\hat m_W(t) = \sum_k Z_{W,k}\, h_W(t,k), \quad t \in I.$$  (1)

The choice of the linear system does not depend on the signal s(t) nor on the
distribution of the noise nor on the nonlinearity f(x) in the transmitter; it only
depends  on  the  time  interval  I.  On  the  other  hand,  the  nonlinearity  g(x)  in  the 
receiver  is  determined  by  the  distribution  of  the  noise  and  the  nonlinearity  f(x) 
in the transmitter. The estimate $\hat s_W(t)$ of s(t) is defined by

$$\hat s_W(t) = g[\hat m_W(t)], \quad t \in I.$$  (2)


The main results of the paper are the mean-square consistency of the
estimate (2) (Theorems 2.1 and 3.1), its strong consistency (Theorems 2.3 and 3.3),
and a central limit theorem for the error $\hat s_W(t) - s(t)$ (Theorems 2.4 and 3.4). Of
possible independent interest are the convergence properties (Theorems 4.0-4.1) of
$\hat m_W(t)$ as an estimate of the mean function $m(t) = E[f(s(t)+X)]$ of the output of the
nonlinearity f.

The  feasibility  of  the  reconstruction  of  the  signal  was  suggested  by  the 
results of a recent paper [3] by the authors, according to which s(t) can be
determined from the mean function $m(t) = E[f(s(t)+X)]$. This suggests that an
estimate of s(t) can be obtained from an estimate of m(t) via (2). The form of the
estimate  (1)  for  m(t),  i.e.,  the  linear  system  in  the  receiver,  was  motivated  by 
the  work  of  Dorogovcev  [4]  on  the  nonparametric  estimation  of  regression  functions. 

Throughout the manuscript we shall assume that s(t) belongs to the following
class of signals.

Assumption A. Let b be a fixed known positive constant, and s(t) be any
uniformly continuous function on the interval I (finite or infinite) satisfying
$|s(t)| \le b$ for all $t \in I$.

As  a  consequence,  the  receiver  structure  and  the  convergence  results  of 
this  paper  are  nonparametric  in  the  signal.  Incidentally,  additional  assumptions 
on the signal, such as differentiability or bandlimitedness, do not provide an
improvement  in  the  rate  of  convergence. 

The  organization  of  the  paper  is  as  follows:  Due  to  its  apparent  practical 
significance,  the  case  of  a  hardlimiter,  f(x)  =  sgn  x,  is  presented  and  discussed 
separately  in  Section  II.  The  general  case  is  considered  in  Section  III.  In 
Section IV the convergence properties of the estimate $\hat m_W(t)$ are obtained. The
derivations  of  the  theorems  stated  in  Sections  II  and  III  are  given  in  Section  V. 

Throughout this paper, the expressions $O(\cdot)$ and $o(\cdot)$ as $W \to \infty$ are uniform
in t over closed and bounded intervals in the interior of the interval I. This
property will not be repeated in the statements of the theorems.


II.  THE  HARDLIMITER  CASE 


In  this  section  we  consider  the  hardlimiter  case,  f(x)  =  sgn  x,  for  which 
the  transmitted  data  is  binary.  We  make  the  following  assumptions. 

(i) The signal s(t) satisfies Assumption A and the interval I is either
[0,1] or $[0,\infty)$ (other choices are discussed in Section III).

(ii) The distribution of the noise X is either normal with mean zero,
known variance $\sigma^2$ and density $\phi(x;\sigma)$, or uniform over [-b,b]. (Other appropriate
distributions, such as Laplacian, could also be used.)

Define the function $\mu(s)$ by

$$\mu(s) = E[\mathrm{sgn}(s+X)], \quad -\infty < s < \infty.$$

When X is normal, $\mu(s)$ is given by

$$\mu_N(s) = \sqrt{\frac{2}{\pi}} \int_0^{s/\sigma} e^{-u^2/2}\,du, \quad -\infty < s < \infty,$$  (3a)

and when X is uniform over [-b,b], we have

$$\mu_U(s) = \begin{cases} -1, & s < -b \\ s/b, & -b \le s \le b \\ 1, & b < s. \end{cases}$$  (3b)

Note that $\mu_N(s)$ and $\mu_U(s)$ are strictly monotonic over $(-\infty,\infty)$ and [-b,b], respectively.
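As a numerical illustration of (3a) and (3b), the following minimal sketch evaluates the two mean functions and compares them with Monte Carlo averages of sgn(s+X). The function names, the sample size, and the seed are illustrative assumptions, not part of the paper; the normal case uses the identity $\sqrt{2/\pi}\int_0^{s/\sigma} e^{-u^2/2}du = \mathrm{erf}(s/(\sigma\sqrt{2}))$.

```python
import math
import random

def mu_normal(s, sigma):
    """mu_N(s) = E[sgn(s + X)] for X ~ N(0, sigma^2); (3a) written via erf."""
    return math.erf(s / (sigma * math.sqrt(2.0)))

def mu_uniform(s, b):
    """mu_U(s) = E[sgn(s + X)] for X uniform on [-b, b]; this is (3b)."""
    return max(-1.0, min(1.0, s / b))

def sgn(x):
    # The value assigned at x == 0 is immaterial (probability-zero event).
    return 1.0 if x >= 0.0 else -1.0

def empirical_mean(s, draw_noise, n, seed=0):
    """Monte Carlo estimate of E[sgn(s + X)] from n independent noise draws."""
    rng = random.Random(seed)
    return sum(sgn(s + draw_noise(rng)) for _ in range(n)) / n

# Hypothetical test point and noise parameters, chosen only for illustration.
s, sigma, b = 0.3, 1.0, 1.0
mc_normal = empirical_mean(s, lambda r: r.gauss(0.0, sigma), 100_000)
mc_uniform = empirical_mean(s, lambda r: r.uniform(-b, b), 100_000)
print("normal :", mu_normal(s, sigma), mc_normal)
print("uniform:", mu_uniform(s, b), mc_uniform)
```

Both closed forms should agree with the simulated averages to within the Monte Carlo error, which is of order $n^{-1/2}$.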

We now specify the structure of the receiver. When X is normal, the
nonlinearity g(x) is chosen as

$$g_N(x) = \begin{cases} \mu_N^{-1}(x), & |x| \le \mu_N(c) \\ 0, & |x| > \mu_N(c) \end{cases}, \quad c = b + \varepsilon,\ \varepsilon > 0.$$  (4a)
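The receiver nonlinearity (4a) requires the inverse of $\mu_N$. Since $\mu_N(s) = 2\Phi(s/\sigma) - 1$ with $\Phi$ the standard normal distribution function, one way to evaluate it is $\mu_N^{-1}(x) = \sigma\,\Phi^{-1}((1+x)/2)$. The sketch below follows that route; the function names and the parameter values in the check are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def mu_normal(s, sigma):
    """mu_N(s) of (3a), written via the error function."""
    return math.erf(s / (sigma * math.sqrt(2.0)))

def g_normal(x, b, eps, sigma):
    """Receiver nonlinearity g_N of (4a) with c = b + eps.

    Inside |x| <= mu_N(c) it returns mu_N^{-1}(x); outside it returns 0.
    """
    c = b + eps
    if abs(x) <= mu_normal(c, sigma):
        # mu_N(s) = 2*Phi(s/sigma) - 1  =>  s = sigma * Phi^{-1}((1 + x) / 2)
        return sigma * NormalDist().inv_cdf((1.0 + x) / 2.0)
    return 0.0

# Round trip s -> mu_N(s) -> g_N recovers s whenever |s| <= c.
b, eps, sigma = 1.0, 1.0, 2.0
for s in (-1.0, -0.3, 0.0, 0.7, 1.0):
    print(s, g_normal(mu_normal(s, sigma), b, eps, sigma))
```

The truncation to zero outside $|x| \le \mu_N(c)$ keeps the estimate bounded when the linear-system output falls outside the range that a signal bounded by b could produce in the mean.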

When X is uniform over [-b,b], the nonlinearity g(x) is chosen as

$$g_U(x) = \begin{cases} bx, & |x| \le 1 \\ 0, & |x| > 1. \end{cases}$$  (4b)


The choice of the linear system $h_W = \{h_W(t,k)\}_k$ depends only on the interval I.
When $I = [0,\infty)$, $h_W$ is defined by

$$h_W(t,k) = e^{-Wt}\,\frac{(Wt)^k}{k!}, \quad k = 0,1,\ldots,\ t \ge 0,\ W > 0,$$  (5)

and when I = [0,1], by

$$h_W(t,k) = \binom{W}{k} t^k (1-t)^{W-k}, \quad k = 0,1,\ldots,W,\ 0 \le t \le 1,$$  (6)

with W a positive integer.
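The weights (5) and (6) are Poisson and binomial probabilities, so they can be evaluated stably in log space. The following sketch does this with the standard library; the truncation rule for the infinite Szasz sum is an illustrative assumption, chosen so that the neglected tail is negligible for moderate Wt.

```python
import math

def szasz_weights(t, W, kmax=None):
    """h_W(t,k) = exp(-Wt) (Wt)^k / k!, k = 0..kmax, as in (5), for t > 0."""
    lam = W * t
    if kmax is None:
        # Illustrative truncation: mean plus a wide margin of standard deviations.
        kmax = int(lam + 12.0 * math.sqrt(lam) + 30.0)
    log_lam = math.log(lam)
    return [math.exp(-lam + k * log_lam - math.lgamma(k + 1))
            for k in range(kmax + 1)]

def bernstein_weights(t, W):
    """h_W(t,k) = C(W,k) t^k (1-t)^(W-k), k = 0..W, as in (6), for 0 < t < 1."""
    log_t, log_1mt = math.log(t), math.log(1.0 - t)
    log_w_fact = math.lgamma(W + 1)
    return [math.exp(log_w_fact - math.lgamma(k + 1) - math.lgamma(W - k + 1)
                     + k * log_t + (W - k) * log_1mt)
            for k in range(W + 1)]

# Both families are nonnegative and sum to one (up to truncation and rounding).
print(sum(szasz_weights(0.4, 50)))
print(sum(bernstein_weights(0.3, 500)))
```

Working with log-gamma avoids the overflow that a direct evaluation of W! or $(Wt)^k$ would cause at large sampling rates.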

A more general class of linear systems is considered in Section III. With

$$\hat m_W(t) = \sum_k \mathrm{sgn}[s(k/W) + X_k]\, h_W(t,k), \quad t \in I,$$  (7)

representing the output of either linear system (5) or (6), the estimate $\hat s_W(t)$ of
s(t) is given by

$$\hat s_W(t) = g_N[\hat m_W(t)], \quad t \in I,$$  (8a)

when X is normal, and by

$$\hat s_W(t) = b\,\hat m_W(t), \quad t \in I,$$  (8b)

when X is uniform over [-b,b]. (Since $|\hat m_W(t)| \le 1$, only the linear portion of $g_U(x)$
is used.) Thus in the latter case, the estimate $\hat s_W(t)$ is linear in the data
$\{\mathrm{sgn}[s(k/W) + X_k]\}_k$.
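To make the transmitter/receiver chain concrete, here is a minimal simulation sketch of the linear estimate (8b) on I = [0,1]: uniform noise on [-b,b], hardlimiting, and the Bernstein weights (6). The test signal, the evaluation point t = 0.3, the number of runs, and the seed are illustrative assumptions and do not come from the paper.

```python
import math
import random

def bernstein_weights(t, W):
    """h_W(t,k) of (6) for 0 < t < 1, evaluated in log space."""
    log_t, log_1mt = math.log(t), math.log(1.0 - t)
    return [math.exp(math.lgamma(W + 1) - math.lgamma(k + 1)
                     - math.lgamma(W - k + 1) + k * log_t + (W - k) * log_1mt)
            for k in range(W + 1)]

def hardlimited_samples(signal, W, b, rng):
    """Transmitted binary data sgn[s(k/W) + X_k], X_k i.i.d. uniform on [-b, b]."""
    return [1.0 if signal(k / W) + rng.uniform(-b, b) >= 0.0 else -1.0
            for k in range(W + 1)]

def estimate(data, weights, b):
    """Linear estimate (8b): b times the linear-system output (7)."""
    return b * sum(z * h for z, h in zip(data, weights))

def empirical_mse(signal, t, W, b, runs, seed=0):
    """Average squared error of the estimate at time t over independent runs."""
    rng = random.Random(seed)
    weights = bernstein_weights(t, W)
    total = 0.0
    for _ in range(runs):
        s_hat = estimate(hardlimited_samples(signal, W, b, rng), weights, b)
        total += (s_hat - signal(t)) ** 2
    return total / runs

def signal(u):
    # Hypothetical test signal with |s| <= 0.5 < b = 1.
    return 0.5 * math.sin(2.0 * math.pi * u)

for W in (100, 400, 1600):
    print(W, empirical_mse(signal, 0.3, W, 1.0, runs=200))
```

In runs of this kind the squared error should shrink roughly like $W^{-1/2}$, which is slow but consistent with reconstruction from one bit per sample.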

Our first result shows the mean-square consistency of the estimate $\hat s_W(t)$
and provides bounds on the rate of convergence; it is stated in terms of the
modulus of continuity of s(t) defined by

$$\omega(s;\delta) = \sup\{|s(t) - s(t')| : t, t' \in I,\ |t - t'| \le \delta\}, \quad \delta > 0.$$

Theorem 2.1. (a) If I = [0,1] and the linear system is determined by (6)
then for every 0 < t < 1, the estimates $\hat s_W(t)$, given by (8a) and (8b), converge in
the mean-square sense to s(t), as $W \to \infty$, and

(9a)


(b) If $I = [0,\infty)$ and the linear system is determined by (5) then for
every t > 0, the estimates $\hat s_W(t)$, given by (8a) and (8b), converge in the
mean-square sense to s(t), as $W \to \infty$, and

$$E[\hat s_W(t) - s(t)]^2 \le K_1\,\omega^2(s; \sqrt{t/W}) + K_2\, e^{-2Wt} I_0(2Wt),$$  (9b)

where $I_0(x)$ is the modified Bessel function of the first kind of order zero and
$\exp(-2Wt)\, I_0(2Wt) = [1 + o(1)]/\sqrt{4\pi W t}$.
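The Bessel term in (9b) is the sum of squared Szasz weights, since $\sum_k [e^{-\lambda}\lambda^k/k!]^2 = e^{-2\lambda} I_0(2\lambda)$ with $\lambda = Wt$ (this follows from the power series of $I_0$). The sketch below evaluates that sum directly and compares it with the stated asymptotic value $1/\sqrt{4\pi W t}$; the truncation rule and the chosen values of $\lambda$ are illustrative assumptions.

```python
import math

def szasz_sum_of_squares(lam):
    """sum_k [exp(-lam) lam^k / k!]^2, which equals exp(-2 lam) * I_0(2 lam)."""
    kmax = int(lam + 12.0 * math.sqrt(lam) + 30.0)  # illustrative truncation
    log_lam = math.log(lam)
    return sum(math.exp(2.0 * (-lam + k * log_lam - math.lgamma(k + 1)))
               for k in range(kmax + 1))

def asymptotic_value(lam):
    """Leading-order approximation 1 / sqrt(4 pi lam), with lam = W t."""
    return 1.0 / math.sqrt(4.0 * math.pi * lam)

for lam in (5.0, 50.0, 500.0):
    exact, approx = szasz_sum_of_squares(lam), asymptotic_value(lam)
    print(lam, exact, approx, exact / approx)
```

The ratio in the last column approaches one as Wt grows, which is the content of the $[1 + o(1)]$ factor.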

The constants $K_1$, $K_2$ are the same for both parts (a) and (b) and are given
as follows for the estimates (8a) and (8b).

For (8a): $K_1 = \dfrac{8}{\pi\sigma^2}\, K_2$, $\quad K_2 = \dfrac{1 + (b/\varepsilon)^2}{4\,\phi^2(b+\varepsilon;\sigma)}$.

For (8b): $K_1 = 4$, $\quad K_2 = b^2$.


In the bounds (9) on the mean-square error, the first term is due to the
bias of the estimate whereas the second term is due to its variance; the bias
depends on the modulus of continuity of the signal whereas the variance is always
$O(W^{-1/2})$. For example, if s(t) is Lip $\gamma$, $0 < \gamma \le 1$, then the mean-square error is
$O(W^{-\min(\gamma,1/2)})$, and for $\gamma > 1/2$ it is dominated by the variance and is $O(W^{-1/2})$.
Additional smoothness conditions on the signal s(t), such as differentiability,
would provide a faster convergence rate for the bias but would not improve the rate
of convergence of the mean-square error.

When X is normal, the constants $K_1$ and $K_2$ depend on the variance $\sigma^2$ of X
and on $\varepsilon$ (cf. (4a)). When the variance is asymptotically dominant, e.g., if the
signal is Lip $\gamma$ with $1/2 < \gamma \le 1$, asymptotically optimal choices for $\sigma$ and $\varepsilon$ can be
found by minimizing $K_2$; we find $\sigma = 2b$ and $\varepsilon = b$, for which $K_2 = 4\pi e b^2$ and $K_1 = 8e$
(and these values are larger than those when X is uniform).
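The optimal choice quoted above can be checked numerically. The sketch assumes the constants for the normal case read $K_2 = [1 + (b/\varepsilon)^2]/[4\phi^2(b+\varepsilon;\sigma)]$ and $K_1 = (8/\pi\sigma^2) K_2$, with $\phi(x;\sigma)$ the $N(0,\sigma^2)$ density; these forms are inferred from the garbled source and are consistent with the stated optimum, so they should be treated as an assumption. The grid used for the search is illustrative.

```python
import math

def phi(x, sigma):
    """Density of N(0, sigma^2) at x."""
    return math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))

def k2_normal(sigma, eps, b):
    """Assumed form of K_2 for the estimate (8a)."""
    return (1.0 + (b / eps) ** 2) / (4.0 * phi(b + eps, sigma) ** 2)

def k1_normal(sigma, eps, b):
    """Assumed form of K_1 for the estimate (8a)."""
    return 8.0 / (math.pi * sigma ** 2) * k2_normal(sigma, eps, b)

b = 1.0
# Coarse grid search for the minimizer of K_2 over (sigma, eps).
grid = [0.5 + 0.05 * i for i in range(61)]  # 0.5, 0.55, ..., 3.5
best = min((k2_normal(sg, ep, b), sg, ep) for sg in grid for ep in grid)
print("grid minimum (K2, sigma, eps):", best)
print("K2 at (2b, b):", k2_normal(2.0 * b, b, b), "vs 4*pi*e*b^2 =", 4.0 * math.pi * math.e * b * b)
print("K1 at (2b, b):", k1_normal(2.0 * b, b, b), "vs 8e =", 8.0 * math.e)
```

Under these assumed forms the grid minimum falls at $\sigma = 2b$, $\varepsilon = b$, and both constants exceed the uniform-noise values $K_1 = 4$, $K_2 = b^2$.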

The next theorem shows that the estimates $\hat s_W(t)$ converge to s(t) in the
$2\ell$th mean for every integer $\ell \ge 1$ and that faster rates of convergence are
available in this case.

Theorem 2.2. Let s(t) be Lip $\gamma$ on I, $0 < \gamma \le 1$. Then for all t in the
interior of I and for every integer $\ell \ge 1$ the estimates (8a)-(8b) satisfy

$$E[\hat s_W(t) - s(t)]^{2\ell} = O(W^{-\ell \min(\gamma,1/2)}).$$

From the practical point of view, convergence of the estimate $\hat s_W(t)$ to s(t)
with probability one (rather than in the mean) is preferable, so that s(t) can be
reconstructed from almost every realization of the data $\{\mathrm{sgn}[s(k/W) + X_k]\}_k$, i.e.,
corresponding to almost every realization of the noise sequence $\{X_k\}_k$. This strong
consistency of the estimate $\hat s_W(t)$, along with its rate of convergence, is given in
the next theorem.

Theorem 2.3. Let s(t) be Lip $\gamma$ on I, $0 < \gamma \le 1$, and let $\alpha$ be any constant
satisfying $0 < \alpha < (1/2)\min(\gamma,1/2)$.

(a) If the linear system is determined by (5) then for each fixed $t \in (0,\infty)$
and each fixed sequence $W_n \to \infty$ as $n \to \infty$, we have with probability one

$$(W_n)^{\alpha}\, |\hat s_{W_n}(t) - s(t)| \to 0 \quad \text{as } n \to \infty.$$

(b) If the linear system is determined by (6), then for each fixed $t \in (0,1)$
and with W = n, a positive integer, we have with probability one

$$n^{\alpha}\, |\hat s_n(t) - s(t)| \to 0 \quad \text{as } n \to \infty.$$

Our final result in this section provides a central limit theorem for the
error $\hat s_W(t) - s(t)$. When the noise X is uniform, we shall assume that it is uniform
over [-c,c] with c > b, in which case the estimate (8b) is replaced by
$\hat s_W(t) = c\,\hat m_W(t)$.


Theorem 2.4. Let s(t) be Lip $\gamma$ on I, $1/2 < \gamma \le 1$. Define

$$\tilde s_W(t) = \beta_W(t)\,[\hat s_W(t) - s(t)], \quad t \in I,$$

where the normalizing factor $\beta_W(t)$ is specified below.

(a) If $I = [0,\infty)$ and the linear system is determined by (5) then the values
of the process $\{\tilde s_W(t),\ 0 < t < \infty\}$ at distinct instants $\{t_i\}$ are asymptotically
independent standard normal variables as $W \to \infty$.

(b) If I = [0,1] and the linear system is determined by (6) then for each
fixed $t \in (0,1)$, $\tilde s_W(t)$ is an asymptotically standard normal variable as $W \to \infty$.

When X is uniform over [-c,c], $\beta_W(t)$ is given by

$$\beta_W(t) = \{c^2\, \mathrm{Var}[\hat m_W(t)]\}^{-1/2},$$

and when X is normal, by

$$\beta_W(t) = 2\,\phi[s(t);\sigma]\, \{\mathrm{Var}[\hat m_W(t)]\}^{-1/2}.$$

Upper and lower bounds on $\beta_W(t)$ can be obtained from the bounds (35) on $\mathrm{Var}[\hat m_W(t)]$.


The asymptotic normality of $\tilde s_W(t)$ can be used to obtain confidence intervals
for the error $\hat s_W(t) - s(t)$ by using the bounds on $\beta_W(t)$.

We conclude this section with some practical comments on the various
transmitter/receiver combinations. Clearly, the simplest transmitter uses uniformly
distributed noise and the corresponding receiver is then linear. The "Bernstein"
linear receiver would be the simplest to use since it employs a finite number
(W+1) of samples to reconstruct the signal over the interval [0,1]. The actual
sampling rate W to be used can be determined from Theorem 2.1 to correspond to an
acceptable mean-square error. For signals defined over $[0,\infty)$, aside from using the
"Szasz" linear receiver, one could also use the "Bernstein" linear receiver
sequentially over consecutive intervals of unit length.
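As an illustration of choosing the sampling rate from Theorem 2.1, the sketch below uses the leading-order form of bound (9b) for the linear estimate (8b), with $K_1 = 4$, $K_2 = b^2$, and $e^{-2Wt} I_0(2Wt)$ replaced by $1/\sqrt{4\pi W t}$ (the $1 + o(1)$ factor is dropped). It further assumes a Lip $\gamma$ signal with a known constant L, so that $\omega(s;\delta) \le L\delta^{\gamma}$. The function names and the numerical values are hypothetical.

```python
import math

def mse_bound(W, t, b, lip_const, gamma):
    """Leading-order version of (9b) for estimate (8b) and a Lip-gamma signal."""
    omega = lip_const * math.sqrt(t / W) ** gamma   # omega(s; sqrt(t/W)) <= L * delta^gamma
    bias_term = 4.0 * omega ** 2                    # K_1 = 4
    variance_term = b ** 2 / math.sqrt(4.0 * math.pi * W * t)  # K_2 = b^2
    return bias_term + variance_term

def smallest_rate(target, t, b, lip_const, gamma):
    """Smallest integer W whose bound is at most target (bound decreases in W)."""
    hi = 1
    while mse_bound(hi, t, b, lip_const, gamma) > target:
        hi *= 2
    lo = hi // 2  # the bound exceeds target at lo, or lo == 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mse_bound(mid, t, b, lip_const, gamma) <= target:
            hi = mid
        else:
            lo = mid
    return hi

W_needed = smallest_rate(target=0.01, t=0.5, b=1.0, lip_const=1.0, gamma=1.0)
print(W_needed, mse_bound(W_needed, 0.5, 1.0, 1.0, 1.0))
```

Because the variance term decays only like $W^{-1/2}$, halving the acceptable mean-square error roughly quadruples the required sampling rate once that term dominates.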

III. THE GENERAL CASE


In this section we consider general (nonconstant) nonlinearities f(x) in
the transmitter and, under appropriate conditions on f(x), we specify noise
distributions, linear systems $h_W$, and memoryless nonlinearities g(x) such that $\hat s_W(t)$,
defined by (2) and (1), is a consistent estimate of the signal s(t) as $W \to \infty$.
Theorems 3.1-3.4 contain Theorems 2.1-2.4 as special cases.

We first specify the distribution of the noise X, introduce appropriate
assumptions on f(x), and specify the memoryless nonlinearity g(x) in the receiver.
There are two types of symmetric distributions appropriate here, those supported
by the entire real line $(-\infty,\infty)$, and those supported by the finite interval [-b,b].
For the sake of concreteness we will concentrate on two such typical distributions,
the normal $N(0,\sigma^2)$ with density $\phi(x;\sigma)$ and the uniform over [-b,b]. We shall use
the function $\mu(s)$ defined by

$$\mu(s) = E[f(s+X)], \quad -\infty < s < \infty.$$  (10)

Clearly, $\mu(s)$ depends on f and on the distribution of X. When X is $N(0,\sigma^2)$, we
have

$$\mu_N(s) = \int_{-\infty}^{\infty} f(s+x)\,\phi(x;\sigma)\,dx,$$

and when X is uniform over [-b,b] we have

$$\mu_U(s) = \frac{1}{2b} \int_{-b}^{b} f(s+x)\,dx.$$


In the particular case when f(x) = sgn x, $\mu_U(s)$ and $\mu_N(s)$ are given by (3). For
monotonic nonlinearities f(x) (which need not be strictly monotonic, e.g., the
hardlimiter), and for the case of nonlinearities f(x) described in (B2) below
(which  need  not  be  monotonic,  e.g.  f(x)  =  x  -  o  x),  it  has  been  shown  in  [3] 
that $\mu_N(s)$ is strictly monotonic, and thus its inverse $\mu_N^{-1}(x)$ exists. We now
specify the memoryless nonlinearity g(x) in the receiver, for various classes of
transmitter nonlinearities f(x) and noise distributions.

Assumption B. We say that (B) is satisfied if any one of (B1), (B2),
or (B3) is satisfied.

(B1): i. X is $N(0,\sigma^2)$.
      ii. $f(x) \in L_4[\phi(x;\sigma)dx]$ and is monotonic (not necessarily strictly).
      iii. $g(x) = \mu_N^{-1}(x)$ for $\mu_N(-c) \le x \le \mu_N(c)$ and $g(x) = 0$ otherwise, where $c = b + \varepsilon$, $\varepsilon > 0$.

(B2): i. X is $N(0,\sigma^2)$.
      ii. $f(x) \in L_4[\phi(x;\sigma)dx]$ is an odd function and has nonnegative Hermite
          coefficients $\{e_k\}_k$ with $e_1 > 0$. (See [3].)
      iii. $g(x) = \mu_N^{-1}(x)$ for $-\infty < x < \infty$.

(B3): i. X is uniform over [-b,b].
      ii. f(x) = sgn x.
      iii. $g(x) = bx$ for $|x| \le 1$ and $g(x) = 0$ for $|x| > 1$.

Our first result shows the mean-square consistency of the estimate
$\hat s_W(t)$ under general conditions on the linear system $h_W = \{h_W(t,k)\}_k$.

Theorem 3.0. Let Assumptions (A) and (B) be satisfied. For every
$t \in I$ for which

i. $h_W(t,k) \ge 0$ for all k,

ii. $\sum_k h_W(t,k) = 1$,

iii. $\sum_k (t - k/W)^2\, h_W(t,k) \to 0$ as $W \to \infty$,

iv. $\sum_k h_W^2(t,k) \to 0$ as $W \to \infty$,

the estimate (2) converges in quadratic mean to s(t) as $W \to \infty$.

The first condition on $h_W$ makes the linear system a positive linear
operator, the second is a summability/normalization condition, the third guarantees
that the bias of the estimate tends to zero, and the fourth condition guarantees
that the variance of the estimate goes to zero. A large class of linear systems
$h_W$ satisfying the conditions of Theorem 3.0 can be obtained as follows.
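The four conditions of Theorem 3.0 can be checked numerically for a concrete linear system. The sketch below does this for the Bernstein weights (6), reading condition (iii) as the weighted second moment $\sum_k (t - k/W)^2 h_W(t,k)$; for binomial weights this moment equals $t(1-t)/W$ exactly, and the sum of squared weights behaves like $1/(2\sqrt{\pi W t(1-t)})$. The function names and test values are illustrative assumptions.

```python
import math

def bernstein_weights(t, W):
    """h_W(t,k) = C(W,k) t^k (1-t)^(W-k) for 0 < t < 1, evaluated in log space."""
    log_t, log_1mt = math.log(t), math.log(1.0 - t)
    return [math.exp(math.lgamma(W + 1) - math.lgamma(k + 1)
                     - math.lgamma(W - k + 1) + k * log_t + (W - k) * log_1mt)
            for k in range(W + 1)]

def theorem_3_0_quantities(t, W):
    """Return (min weight, sum of weights, second moment, sum of squares)."""
    h = bernstein_weights(t, W)
    second_moment = sum((t - k / W) ** 2 * hk for k, hk in enumerate(h))
    sum_of_squares = sum(hk * hk for hk in h)
    return min(h), sum(h), second_moment, sum_of_squares

t = 0.3
for W in (25, 100, 400, 1600):
    lowest, total, second, squares = theorem_3_0_quantities(t, W)
    print(W, lowest >= 0.0, total, second, t * (1 - t) / W,
          squares, 1.0 / (2.0 * math.sqrt(math.pi * W * t * (1 - t))))
```

The second moment (bias control) decays like 1/W while the sum of squares (variance control) decays only like $W^{-1/2}$, which is why the variance term limits the overall rate.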

Proposition 3.0. Let $\{\xi_i\}_{i=1}^{\infty}$ be a sequence of independent identically
distributed random variables with integer values, mean $t \in I$, and finite second
moments. Then $\{h_n(t,k)\}_k$, defined by

$$h_n(t,k) = \Pr\{\xi_1 + \cdots + \xi_n = k\}, \quad n = 1,2,\ldots,\ k = 0, \pm 1, \ldots,$$  (11)

satisfies assumptions (i)-(iv) of Theorem 3.0 (with W taking positive integer values)
for every $t \in I$ for which the random variable $\xi_1$ is not degenerate.

Positive linear operators of the type described in Proposition 3.0 have
been considered in the approximation theory literature [5], where conditions (i)-(iii)
of Theorem 3.0 are established and used for the interpolation of continuous functions
m(t) on I by $\sum_k m(k/n)\, h_n(t,k)$. We mention two examples: When each $\xi_i$ takes on the
values 0 and 1 with $\Pr\{\xi_i = 1\} = t$, then $h_n$ is given by (6) and represents the
Bernstein operator. When each $\xi_i$ is Poisson with parameter t, then $h_n$ is given by
(5) (with W = n) and represents the Szasz operator.
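The construction (11) can be illustrated by forming the n-fold convolution of a step distribution and comparing it with the closed-form weights. The sketch below does this for the Bernoulli case, which should reproduce the Bernstein weights (6). It assumes, for simplicity, that the step takes values 0, 1, ..., so that a probability mass function can be stored as a list indexed by value; the helper names are hypothetical.

```python
import math

def convolve(p, q):
    """Distribution of the sum of two independent variables with pmfs p and q."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out

def weights_from_steps(step_pmf, n):
    """h_n(t,k) = Pr{xi_1 + ... + xi_n = k}, by repeated convolution as in (11)."""
    pmf = [1.0]  # distribution of the empty sum
    for _ in range(n):
        pmf = convolve(pmf, step_pmf)
    return pmf

def bernstein_weight(t, n, k):
    """Closed form C(n,k) t^k (1-t)^(n-k) of (6)."""
    return math.comb(n, k) * t ** k * (1.0 - t) ** (n - k)

t, n = 0.3, 20
h = weights_from_steps([1.0 - t, t], n)  # Bernoulli(t) steps, mean t
max_gap = max(abs(h[k] - bernstein_weight(t, n, k)) for k in range(n + 1))
print(len(h), sum(h), max_gap)
```

Replacing the Bernoulli step by a (truncated) Poisson step with parameter t gives the Szasz weights in the same way, since a sum of n such variables is Poisson with parameter nt.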

Theorem  3.0,  while  guaranteeing  mean-square  consistency  of  the  estimate 
sw(t),  provides  no  bounds  on  the  rate  of  convergence.  We  shall  derive  such  bounds 
for  linear  systems  hw  corresponding  to  the  class  of  generalized  Szasz  operators  [6] 
(see  below)  and  to  the  Bernstein  operator.  While  the  Szasz  operator  (5)  can  be 
generated  as  in  Proposition  3.0,  the  class  of  generalized  Szasz  operators  cannot. 

We  consider  the  entire  class  of  generalized  Szasz  operators,  rather  than  the  single 
Szasz operator, because with no additional work we obtain the same rates of
convergence for this entire class of linear systems.

We now introduce the generalized Szasz operators. Let $A(z) = \sum_{n=0}^{\infty} a_n z^n$
be an analytic function in $|z| < R$, for some R > 1, and suppose that for all n

$$a_n \ge 0 \quad \text{and} \quad A(1) = \sum_{n=0}^{\infty} a_n > 0.$$

The Appell polynomials [7] $p_k(u)$, $u \ge 0$, are defined by their generating function

$$A(z)\, e^{uz} = \sum_{k=0}^{\infty} p_k(u)\, z^k,$$  (12)

i.e.,

$$p_k(u) = \sum_{j=0}^{k} a_{k-j}\, \frac{u^j}{j!}.$$

The generalized Szasz operator is represented by $h_W = \{h_W(t,k)\}_k$ with

$$h_W(t,k) = \frac{e^{-Wt}}{A(1)}\, p_k(Wt), \quad k = 0,1,\ldots,\ t \ge 0,\ W > 0.$$  (13)

The Szasz operator corresponds to A(z) = 1, for which $p_k(u) = u^k/k!$.
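The generalized Szasz weights (13) are straightforward to compute when A(z) is a polynomial. The sketch below evaluates the Appell polynomials from the explicit sum following (12) and forms the weights; the coefficient list in the example and the truncation index kmax are illustrative assumptions. With A(z) = 1 the weights should reduce to the Poisson probabilities of (5).

```python
import math

def appell(a, k, u):
    """p_k(u) = sum_{j=0}^{k} a_{k-j} u^j / j!, with a_n = 0 beyond the list a."""
    return sum(a[k - j] * u ** j / math.factorial(j)
               for j in range(k + 1) if k - j < len(a))

def generalized_szasz_weights(a, t, W, kmax):
    """h_W(t,k) = exp(-Wt) p_k(Wt) / A(1), k = 0..kmax, as in (13)."""
    u = W * t
    scale = math.exp(-u) / sum(a)  # A(1) is the sum of the coefficients
    return [scale * appell(a, k, u) for k in range(kmax + 1)]

t, W, kmax = 0.4, 50, 100

# A(z) = 1 recovers the Szasz (Poisson) weights.
szasz = generalized_szasz_weights([1.0], t, W, kmax)
poisson = [math.exp(-W * t) * (W * t) ** k / math.factorial(k) for k in range(kmax + 1)]
print(max(abs(x - y) for x, y in zip(szasz, poisson)))

# A hypothetical polynomial A(z) = 1 + 0.5 z + 0.25 z^2 with nonnegative coefficients.
general = generalized_szasz_weights([1.0, 0.5, 0.25], t, W, kmax)
print(min(general), sum(general))
```

Setting z = 1 in (12) gives $\sum_k p_k(u) = A(1)e^u$, which is why the weights sum to one up to the truncation at kmax.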

The following assumption specifies the interval I and the linear system $h_W$.

Assumption C. We say that (C) is satisfied if either (C1) or (C2) is
satisfied.

(C1): $I = [0,\infty)$ and $h_W$ is a generalized Szasz operator defined by (13) and (12).

(C2): I = [0,1] and $h_W$ is the Bernstein operator defined by (6).


We shall therefore concentrate on signals s(t) defined on the positive
real line $[0,\infty)$ or the unit interval [0,1]. By appropriate scaling one can
similarly consider signals defined on any half line or any finite interval. The case
of signals defined on the entire real line can be reduced to the positive real line
by separately considering the parts of the signal on $[0,\infty)$ and $(-\infty,0]$.

All  the  following  results  hold  under  Assumptions  (A),  (B)  and  (C). 
Assumption  (A)  states  the  conditions  on  the  signal  s(t),  Assumption  (B)  determines 
the nonlinearity g(x) in the receiver, and Assumption (C) determines the linear
system $h_W$ in the receiver.

Our  next  result  provides  upper  bounds  on  the  mean-square  estimation  error. 

Theorem 3.1. Under Assumptions (A), (B) and (C) we have for each $t \in I$
that the estimate (2) satisfies

$$E[\hat s_W(t) - s(t)]^2 \le K_1\,\omega^2(s; \alpha_W(t)) + K_2\, v_W^2(t),$$

where $\alpha_W(t)$ and $v_W(t)$ are determined by (C),

for (C1): $\alpha_W^2(t) = \dfrac{t}{W} + \dfrac{A''(1) + A'(1)}{A(1)\,W^2}$, $\quad v_W^2(t) = \dfrac{1 + o(1)}{2\sqrt{\pi W t}}$,

for (C2): $\alpha_W^2(t) = \dfrac{t(1-t)}{W}$, $\quad v_W^2(t) = \dfrac{1 + o(1)}{2\sqrt{\pi W t(1-t)}}$,

and the constants $K_1$ and $K_2$ are determined by (B),

for (B1): $K_1 = 4Q^2\,(q^{-2} + (b/\Delta)^2)$, $\quad K_2 = U^2\,(q^{-2} + (b/\Delta)^2)$,

for (B2): $K_1 = 4Q^2/q^2$, $\quad K_2 = U^2/q^2$,

for (B3): $K_1 = 4b^2Q^2$, $\quad K_2 = b^2U^2$,

and the constants q, Q, $U^2$ and $\Delta$ are defined in (17).


It follows that $\hat s_W(t)$ converges to s(t) in quadratic mean for every t
in the interior of the interval I, i.e., for t > 0 under (C1) and 0 < t < 1 under
(C2). Also, for the entire class of generalized Szasz operators, the rate of
decay of $\alpha_W^2(t)$ and $v_W^2(t)$ as $W \to \infty$ is the same, $O(1/W)$ and $O(1/\sqrt{W})$, respectively,
and thus the rate of convergence of the bound on the mean-square error is also the
same. This rate is also identical to that of the Bernstein operator. For example,
when $s(t) \in$ Lip $\gamma$, $0 < \gamma \le 1$, the mean-square error is $O(W^{-\min(\gamma,1/2)})$ for all
choices of linear and nonlinear systems $h_W$ and g(x) covered by Theorem 3.1.

Bounds on the higher order moments of the estimation error can be
obtained in a similar manner and they provide faster rates of convergence.

Theorem 3.2. Assume that s(t) is Lip $\gamma$ on I, $0 < \gamma \le 1$, and that
Assumptions (A), (B) and (C) are satisfied. Let $\ell$ be a positive integer and under
(B1) or (B2) assume, in addition, that $f(x) \in L_{2\ell}[\phi(x;\sigma)dx]$. Then for every t in
the interior of I, the estimate (2) converges in the $2\ell$th mean to s(t) as $W \to \infty$
and for some continuous function $K_{\ell,\gamma}(t)$,

$$E[\hat s_W(t) - s(t)]^{2\ell} \le \frac{K_{\ell,\gamma}(t)}{W^{\ell \min(\gamma,1/2)}}\,[1 + o(1)].$$

The exact expression for $K_{\ell,\gamma}(t)$ is quite involved but easily expressed
in terms of $F_{\gamma}(t)$, introduced in the proof of Theorem 4.2, and the constants in
Proposition 5.1. The bound of Theorem 3.2 can be used to obtain the strong
consistency of the estimate $\hat s_W(t)$ and the rate of almost sure convergence.


Theorem 3.3. Assume that s(t) is Lip $\gamma$ on I, $0 < \gamma \le 1$, that
Assumptions (A), (B) and (C) are satisfied, and in the case of (B1) or (B2) that
$f(x) \in L_{2\ell}[\phi(x;\sigma)dx]$ for some positive integer $\ell$ satisfying $\ell \ge 1 + \gamma^{-1}$ for
$0 < \gamma \le 1/2$, and $\ell \ge 3$ for $1/2 < \gamma \le 1$. Then with $\alpha$ any constant satisfying
$0 < \alpha < (1/2)(\min(\gamma,1/2) - 1/\ell)$, we have

(a) under (C1): For each fixed t > 0 and each fixed sequence of sampling
rates $W_n \uparrow \infty$ as $n \to \infty$, we have with probability one

$$(W_N)^{\alpha} \sup_{n \ge N} |\hat s_{W_n}(t) - s(t)| \to 0 \quad \text{as } N \to \infty;$$

(b) under (C2): For each fixed 0 < t < 1 and with W = n, a positive integer,
we have with probability one

$$N^{\alpha} \sup_{n \ge N} |\hat s_n(t) - s(t)| \to 0 \quad \text{as } N \to \infty.$$

As an example, when f(x) is bounded and monotonic (e.g., hardlimiter,
quantizer) we have $\alpha < (1/2)\min(\gamma,1/2)$ (as $\ell$ may be taken arbitrarily large); and
thus for Lip 1 signals we have, in particular,

$$(W_n)^{\alpha}\, |\hat s_{W_n}(t) - s(t)| \to 0 \quad \text{as } n \to \infty$$

with probability one for all $\alpha < 1/4$.

We finally show that, under certain conditions, the estimate $\hat s_W(t)$ is
asymptotically normal and asymptotically independent at distinct times.

Theorem 3.4. Assume that s(t) is Lip $\gamma$, $1/2 < \gamma \le 1$, and that
Assumptions (A), (B) and (C) are satisfied. In addition, assume that under (B1) or (B2)
we have $f(x) \in L_p[\phi(x;\sigma)dx]$ for all p > 1, and under (B3) that the noise X is
uniform over [-c,c] with c > b. For $t \in I$ define

$$\tilde s_W(t) = \beta_W(t)\,[\hat s_W(t) - s(t)],$$

where

$$\beta_W(t) = \mu'[s(t)]\; \mathrm{Var}^{-1/2}[\hat m_W(t)].$$

(a) For each fixed t in the interior of I, $\tilde s_W(t)$ is asymptotically standard
normal as $W \to \infty$. Bounds on the normalization factor $\beta_W(t)$ can be obtained from (35).

(b) For the Szasz operator in (C1) ($A(z) \equiv 1$) we have, in addition, that the
values of the process $\{\tilde s_W(t),\ t > 0\}$ at distinct t's are asymptotically independent
as $W \to \infty$.

Some comments on Theorem 3.4. First, the theorem remains true if the
statement "s(t) is Lip $\gamma$, $1/2 < \gamma \le 1$" is replaced by "$\omega(s;\delta) = o(\delta^{1/2})$ as $\delta \to 0$"
(cf. the proof of Theorem 4.4). Second, part (b) of Theorem 3.4 remains true if
the Szasz operator is replaced by a generalized Szasz operator for which A(z) is a
polynomial (cf. the proof of Proposition 4.1(b)). The question of asymptotic
independence in the Bernstein case (C2) is open at present. Finally, the normalizing
factor $\beta_W(t)$ will take a simple form if the exact rate of convergence of $\mathrm{Var}[\hat m_W(t)]$
can be established. Specifically, we have obtained in Theorem 4.4 upper and lower
bounds on $\mathrm{Var}[\hat m_W(t)]$ of the form

$$0 < A_1(t)/\sqrt{W} \le \mathrm{Var}[\hat m_W(t)] \le A_2(t)/\sqrt{W}$$

for some specified functions $A_i(t)$, i = 1,2. When s(t) is a constant, we find
$A_1(t) = A_2(t)$, in which case the rate of convergence of $\mathrm{Var}[\hat m_W(t)]$ is exactly $1/\sqrt{W}$.
If it can be established that this rate is valid for all signals s(t) satisfying
Assumption (A), we would then obtain

$$A(t) = \lim_{W \to \infty} W^{1/2}\, \mathrm{Var}[\hat m_W(t)]$$

and the central limit theorem for $\hat s_W(t)$ could be stated in the more standard form:
$W^{1/4}[\hat s_W(t) - s(t)]$ is asymptotically normal with mean zero and variance
$A(t)/\{\mu'[s(t)]\}^2$.

A final comment in this section. For signals s(t) defined on $[0,\infty)$ we
always assumed uniform continuity of s(t) over $[0,\infty)$ and obtained results valid
on $(0,\infty)$ (uniformly on finite subintervals). For signals defined on $[0,\infty)$ that
are continuous but not uniformly continuous, using the results of [8], we obtain
results similar to those of Sections II and III valid over finite subintervals of
$(0,\infty)$ (and expressed in terms of the modulus of continuity of s(t) over each such
subinterval). These results are of obvious interest but are not stated here
explicitly to avoid overburdening the text.

IV. CONVERGENCE PROPERTIES OF THE ESTIMATE $\hat m_W(t)$

Let

$$m(t) = E[f(s(t) + X)] = \mu(s(t)), \quad t \in I,$$

be the mean function of the output of the nonlinearity f(x), where $\mu(s)$ is defined
in (10). We establish the mean-square consistency, strong consistency, and a
central limit theorem for $\hat m_W(t)$, given in (1), as an estimate of m(t). These
results, which are of independent value, are given in Part (b). In Part (a) we
collect certain properties of the function $\mu(s)$. In order not to overburden the
text, the proofs of all the propositions are relegated to an Appendix.

(a) Properties of the Moment Function $\mu(s)$

For each k = 1,2,... define $\mu_k(s)$ by

$$\mu_k(s) = E[f^k(s+X)], \quad -\infty < s < \infty.$$  (14)

When X is $N(0,\sigma^2)$, $\mu_k(s)$, denoted by $\mu_{N,k}(s)$, is well defined whenever
$f(x) \in L_{2k}[\phi(x;\sigma)dx]$, as follows from the inequality

$$|\mu_{N,k}(s)| \le e^{s^2/2\sigma^2}\, \{E[f^{2k}(X)]\}^{1/2}$$  (15)

shown in [3]. The following properties of $\mu_{N,1}(s)$, denoted simply by $\mu_N(s)$, were
shown in [3]. $\mu_N(s)$ is infinitely differentiable. If $f(x) \in L_2[\phi(x;\sigma)dx]$ is
monotonic (not necessarily strictly monotonic) then $\mu_N(s)$ is strictly monotonic and

$$|\mu_N'(s)| > 0 \quad \text{for all } s.$$

If $f(x) \in L_2[\phi(x;\sigma)dx]$ is odd and has nonnegative Hermite coefficients $\{e_k\}_{k=0}^{\infty}$,
then $\mu_N(s)$ is strictly monotonic with $\mu_N'(s) \ge e_1$ for all s and if, moreover,
$e_1 > 0$ then

$$\mu_N'(s) \ge e_1 > 0 \quad \text{for all } s.$$

We shall need (and use) strictly positive lower bounds on $|\mu_N'(s)|$. Note that it
is possible to have $\mu_N'(s) \to 0$ as $|s| \to \infty$ (e.g., if $|f(x)| \le H$ and
$\lim_{x \to \pm\infty} f(x) = \pm H$) and in such cases s would have to be limited to a bounded set
of values.

When X is uniform over [-b,b] and f(x) = sgn x, then $\mu_{U,k}(s)$ is clearly
well defined for all s and all k, and $\mu_{U,1}(s)$, which is denoted by $\mu_U(s)$, is given
by (3b). Also

$$\mu_U'(s) = \frac{1}{b} > 0 \quad \text{for } |s| \le b.$$

In  proving  a  central  limit  theorem  for  my(t)  and  sw ( t )  we  shall  need  the 
following  property: 
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min  Var[f(s+X)]  >  0  .  (16) 

is t  £  b 

When X is $N(0,\sigma^2)$, $\mathrm{Var}[f(s+X)] = \mu_{N,2}(s) - \mu_{N,1}^2(s)$, which is a continuous function of s. Thus, to show (16), it suffices to show that $\mathrm{Var}[f(s+X)] \ne 0$ for all $-\infty < s < \infty$. Indeed, if for some s, $\mathrm{Var}[f(s+X)] = 0$, then f(s+x) = Const for almost all x with respect to the normal density $\varphi(x;\sigma)$ and thus f(y) = Const for almost all y, which contradicts our hypothesis that f(x) is not a constant function. When X is uniform over [-c,c] with c > b, and f(x) = sgn x, then (16) follows from

$$\min_{|s|\le b} \mathrm{Var}[\mathrm{sgn}(s+X)] = 1 - \max_{|s|\le b}\left(\frac{s}{c}\right)^2 = 1 - \left(\frac{b}{c}\right)^2 > 0.$$
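The uniform-noise computation above can be checked numerically. The following is a minimal sketch, assuming the hardlimiter f(x) = sgn x and noise uniform on [-c,c]; the function names and the grid resolution are illustrative choices, not part of the paper.

```python
import math

def mu_uniform(s: float, c: float) -> float:
    """E[sgn(s + X)] for X uniform on [-c, c]: equals s/c, clipped to [-1, 1]."""
    return max(-1.0, min(1.0, s / c))

def mu_numeric(s: float, c: float, n: int = 100_000) -> float:
    """Midpoint-rule approximation of E[sgn(s + X)], as an independent check."""
    step = 2.0 * c / n
    total = 0.0
    for i in range(n):
        x = -c + (i + 0.5) * step
        total += math.copysign(1.0, s + x)
    return total / n

def var_sgn_uniform(s: float, c: float) -> float:
    """Var[sgn(s + X)] = 1 - mu^2, because sgn^2 = 1 almost surely."""
    return 1.0 - mu_uniform(s, c) ** 2

def min_var(b: float, c: float, n: int = 2001) -> float:
    """Minimum of Var[sgn(s + X)] over a grid of s in [-b, b] (endpoints included)."""
    grid = [-b + 2.0 * b * i / (n - 1) for i in range(n)]
    return min(var_sgn_uniform(s, c) for s in grid)

b, c = 1.0, 1.5
print(mu_uniform(0.6, c), mu_numeric(0.6, c))   # closed form vs quadrature
print(min_var(b, c), 1.0 - (b / c) ** 2)        # property (16) with c > b
print(min_var(b, b))                            # degenerate case c = b
```

With c = b the minimum variance is attained at |s| = b and equals zero, which is why the noise range has to exceed the signal bound for (16) to hold.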

Finally we shall use the following finite and nonzero constants whose existence under Assumption (B1) or (B2) follows from the above discussion and under (B3) is evident.

$$q = \begin{cases} \min_{|s|\le c}\ \mu_N'(s), & \text{under (B1)} \\ \min_{|s|<\infty}\ \mu_N'(s) \ge e_1, & \text{under (B2)} \\ \min_{|s|\le b}\ \mu_U'(s), & \text{under (B3)} \end{cases} \tag{17a}$$

$$Q = \max_{|s|\le b}\ \mu'(s), \quad \text{under (B)} \tag{17b}$$

$$U_2 = \max_{|s|\le b}\ \mathrm{Var}[f(s+X)], \quad \text{under (B)} \tag{17c}$$

$$V_2 = \min_{|s|\le b}\ \mathrm{Var}[f(s+X)], \quad \text{under (B)} \tag{17d}$$

$$\Delta = \min\{\mu(c)-\mu(b),\ \mu(-b)-\mu(-c)\} \ge q(c-b), \quad \text{under (B1)} \tag{17e}$$
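As an illustration of the constants in (17), the sketch below evaluates them for one concrete case: the hardlimiter f(x) = sgn x with $N(0,\sigma^2)$ noise, for which $\mu(s) = 2\Phi(s/\sigma) - 1$ and $\mu'(s) = 2\varphi(s;\sigma)$. The values of b, c, $\sigma$ and the grid search are assumptions made for the example only.

```python
import math
from statistics import NormalDist

def constants_hardlimiter_gaussian(b: float, c: float, sigma: float, n: int = 4001):
    """Grid evaluation of q, Q, U2, V2, Delta of (17) for f = sgn and X ~ N(0, sigma^2)."""
    nd = NormalDist(0.0, sigma)

    def mu(s):      # mu(s) = E[sgn(s + X)] = 2*Phi(s/sigma) - 1
        return 2.0 * nd.cdf(s) - 1.0

    def dmu(s):     # mu'(s) = 2 * normal density at s
        return 2.0 * nd.pdf(s)

    def var(s):     # Var[sgn(s + X)] = 1 - mu(s)^2
        return 1.0 - mu(s) ** 2

    def grid(a):    # symmetric grid on [-a, a]; n odd so that 0 is included
        return [-a + 2.0 * a * i / (n - 1) for i in range(n)]

    return {
        "q": min(dmu(s) for s in grid(c)),
        "Q": max(dmu(s) for s in grid(b)),
        "U2": max(var(s) for s in grid(b)),
        "V2": min(var(s) for s in grid(b)),
        "Delta": min(mu(c) - mu(b), mu(-b) - mu(-c)),
    }

consts = constants_hardlimiter_gaussian(b=1.0, c=1.5, sigma=1.0)
for name, value in consts.items():
    print(f"{name} = {value:.6f}")
```

In this case the grid search agrees with the closed forms $q = 2\varphi(c;\sigma)$, $Q = \sqrt{2/\pi}/\sigma$ and $U_2 = 1$, and the inequality $\Delta \ge q(c-b)$ can be checked directly.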

Note that $V_2 = 0$ under (B3) but $V_2 > 0$ under the modified (B3) where X is uniform over [-c,c] with c > b. $V_2$ is used only under the modified (B3) when needed.

(b)  Convergence Properties of $\hat m_W(t)$

We begin by considering the mean-square error for a fixed $t \in I$,

$$E[\hat m_W(t)-m(t)]^2 = \mathrm{Bias}^2[\hat m_W(t)] + \mathrm{Var}[\hat m_W(t)].$$

We have

$$\hat m_W(t) = \sum_k Z_{W,k}\,h_W(t,k), \tag{18}$$

with $Z_{W,k} = f(s(k/W)+X_k)$. The Z's are independent and, for all k,

$$E[Z_{W,k}^2] \le \sup_{|s|\le b} E[f^2(s+X)] < \infty$$

by Assumptions (A) and (B) (cf. (15)). Thus the series (18) converges in quadratic mean, as well as with probability one, provided $\sum_k h_W^2(t,k) < \infty$. Then, since $E[Z_{W,k}] = \mu(s(k/W)) = m(k/W)$, we have

$$E[\hat m_W(t)] = \sum_k m(k/W)\,h_W(t,k) \triangleq P_W(m,t).$$

If $h_W(t,k) \ge 0$ for all k, then $P_W$ is a positive linear operator and by a well-known result in approximation theory (see, for instance, DeVore [9, pp. 28-29]), if $\sum_k h_W(t,k) = 1$ and m(t) is uniformly continuous on I, then

$$|\mathrm{Bias}[\hat m_W(t)]| = |P_W(m,t)-m(t)| \le 2\,\omega(m;\alpha_W(t)), \tag{19}$$

where $\omega(m;\delta)$ is the modulus of continuity of m(t) over I and

$$\alpha_W^2(t) = P_W((\tau-t)^2,t) = \sum_k \left(\frac{k}{W}-t\right)^2 h_W(t,k). \tag{20}$$

Also, using (17c), we have

$$\mathrm{Var}[\hat m_W(t)] = \sum_k \mathrm{Var}[Z_{W,k}]\,h_W^2(t,k) \le U_2\,\nu_W^2(t) \tag{21}$$

where

$$\nu_W^2(t) \triangleq \sum_k h_W^2(t,k). \tag{22}$$

Hence for each $t \in I$ for which $h_W(t,k) \ge 0$ for all k, $\sum_k h_W(t,k) = 1$, and $\sum_k h_W^2(t,k) < \infty$, we have

$$E[\hat m_W(t)-m(t)]^2 \le 4\,\omega^2(m;\alpha_W(t)) + U_2\,\nu_W^2(t). \tag{23}$$

Thus if $\alpha_W^2(t) \to 0$ and $\nu_W^2(t) \to 0$ as $W \to \infty$ it follows that $\hat m_W(t)$ converges in quadratic mean to m(t) as $W \to \infty$. This simple result is stated below.

Theorem 4.0.  Under Assumptions (A) and (B), and for every $t \in I$ for which $h_W$ satisfies conditions (i)-(iv) of Theorem 3.0, we have that $\hat m_W(t)$ converges in quadratic mean to m(t) as $W \to \infty$.
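The bias and variance decomposition behind (23) can be evaluated exactly for a concrete case. The sketch below assumes the Bernstein (binomial) weights on I = [0,1], the hardlimiter with noise uniform on [-b,b] (so that m(t) = s(t)/b, Var[Z] = 1 - m^2 and $U_2 = 1$), and a hypothetical signal s(t) = 0.8 b sin(2 pi t). The modulus of continuity is replaced by the Lipschitz bound $\omega(m;\delta) \le L\delta$, which only loosens the right-hand side.

```python
import math

LIP_M = 0.8 * 2.0 * math.pi   # Lipschitz constant of the hypothetical m(t) below

def m(t: float) -> float:
    """Mean function m(t) = s(t)/b for the illustrative signal s(t) = 0.8 b sin(2 pi t)."""
    return 0.8 * math.sin(2.0 * math.pi * t)

def bernstein_weights(W: int, t: float):
    """Binomial(W, t) probabilities, computed in log space to avoid overflow (0 < t < 1)."""
    log_t, log_1mt = math.log(t), math.log(1.0 - t)
    return [
        math.exp(math.lgamma(W + 1) - math.lgamma(k + 1) - math.lgamma(W - k + 1)
                 + k * log_t + (W - k) * log_1mt)
        for k in range(W + 1)
    ]

def exact_mse_and_bound(W: int, t: float):
    """Exact mean-square error of the estimate of m(t) and the bound 4*omega^2 + U2*nu^2."""
    h = bernstein_weights(W, t)
    nodes = [k / W for k in range(W + 1)]
    mean = sum(m(x) * hk for x, hk in zip(nodes, h))
    variance = sum((1.0 - m(x) ** 2) * hk * hk for x, hk in zip(nodes, h))
    alpha = math.sqrt(sum((x - t) ** 2 * hk for x, hk in zip(nodes, h)))
    nu2 = sum(hk * hk for hk in h)
    mse = (mean - m(t)) ** 2 + variance
    bound = 4.0 * (LIP_M * alpha) ** 2 + 1.0 * nu2   # U2 = 1 in this setting
    return mse, bound

for W in (40, 400, 4000):
    mse, bound = exact_mse_and_bound(W, 0.3)
    print(W, mse, bound)
```

The printed values show both the exact error and the bound shrinking as W grows, with the variance term decaying only like $W^{-1/2}$.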


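The following proposition, whose proof is given in the Appendix, is used in determining bounds on the rate of convergence of $\hat m_W(t)$.

Proposition 4.1.  (a) Let $\Psi(x)$ be a $2\pi$-periodic function continuously differentiable on $[-\pi,\pi]$ with Fourier series $\Psi(x) = \sum_k \psi_k \exp(ikx)$. Then for $\ell = 2,3,\ldots$,

$$\sum_{k=-\infty}^{\infty} (\psi_k)^{\ell} = \frac{1}{(2\pi)^{\ell-1}} \int_{-\pi}^{\pi}\!\cdots\!\int_{-\pi}^{\pi} \Psi\Big(-\sum_{i=1}^{\ell-1} x_i\Big) \prod_{j=1}^{\ell-1} \Psi(x_j)\,dx_j .$$

(b) For the generalized Szasz operator (13) we have for $t, t_1, t_2 > 0$,

(i)
$$\sum_{k=0}^{\infty} h_W(t_1,k)\,h_W(t_2,k) = \frac{\exp(-W(t_1+t_2))}{2A^2(1)} \sum_{k=0}^{\infty} \varepsilon_k b_k \left[\left(\frac{t_1}{t_2}\right)^{k/2} + \left(\frac{t_2}{t_1}\right)^{k/2}\right] I_k(2W\sqrt{t_1 t_2})$$

where $\varepsilon_0 = 1$, $\varepsilon_k = 2$ for $k \ge 1$, $b_k = \sum_{n=0}^{\infty} a_n a_{n+k} \ge 0$ and $I_k(x)$ is the modified Bessel function of the first kind of order k.

(ii)
$$\nu_W^2(t) \triangleq \sum_{k=0}^{\infty} h_W^2(t,k) = \frac{1+o(1)}{2\sqrt{\pi W t}}, \qquad r\,e^{-2Wt} I_0(2Wt) \le \nu_W^2(t) \le e^{-2Wt} I_0(2Wt),$$

where $r = A^{-2}(1)\,b_0 \le 1$.

(iii)
$$\sum_{k=0}^{\infty} [h_W(t,k)]^{\ell} \le \left(e^{-Wt} I_0(Wt)\right)^{\ell-1} = \frac{1+o(1)}{(2\pi W t)^{(\ell-1)/2}}, \qquad \ell \ge 3.$$

(c) For the Bernstein operator (6) we have for 0 < t < 1,

(i)
$$\sum_{k=0}^{W} h_W^2(t,k) = \frac{1+o(1)}{2\sqrt{\pi W t(1-t)}},$$

(ii)
$$\sum_{k=0}^{W} [h_W(t,k)]^{\ell} \le \frac{1+o(1)}{(2\pi W t(1-t))^{(\ell-1)/2}}, \qquad \ell \ge 3.$$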
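The asymptotic forms in Proposition 4.1(b.ii) and (c.i) can be checked numerically. This sketch assumes the plain Szasz case A(z) = 1, in which the weights are Poisson probabilities with mean Wt, and the binomial form of the Bernstein weights; the scaled Bessel function is evaluated from its integral representation by a midpoint rule, and all function names are illustrative.

```python
import math

def szasz_weights(W: float, t: float, kmax: int):
    """Poisson(W t) probabilities for k = 0..kmax (Szasz weights with A(z) = 1)."""
    lam = W * t
    return [math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1)) for k in range(kmax + 1)]

def bernstein_weights(W: int, t: float):
    """Binomial(W, t) probabilities, computed in log space (0 < t < 1)."""
    log_t, log_1mt = math.log(t), math.log(1.0 - t)
    return [
        math.exp(math.lgamma(W + 1) - math.lgamma(k + 1) - math.lgamma(W - k + 1)
                 + k * log_t + (W - k) * log_1mt)
        for k in range(W + 1)
    ]

def scaled_bessel_i0(x: float, n: int = 20000) -> float:
    """e^{-x} I_0(x) = (1/pi) * integral over [0, pi] of exp(x (cos(theta) - 1))."""
    step = math.pi / n
    return sum(math.exp(x * (math.cos((i + 0.5) * step) - 1.0)) for i in range(n)) / n

# Szasz case: sum of squared weights equals e^{-2Wt} I_0(2Wt), about 1 / (2 sqrt(pi W t)).
W, t = 100, 2.0
nu2_szasz = sum(h * h for h in szasz_weights(W, t, kmax=600))
print(nu2_szasz, scaled_bessel_i0(2 * W * t), 1.0 / (2.0 * math.sqrt(math.pi * W * t)))

# Bernstein case: sum of squared weights is about 1 / (2 sqrt(pi W t (1 - t))).
W, t = 400, 0.3
nu2_bern = sum(h * h for h in bernstein_weights(W, t))
print(nu2_bern, 1.0 / (2.0 * math.sqrt(math.pi * W * t * (1.0 - t))))
```

In both cases the sum of squared weights decays like $W^{-1/2}$, which is the source of the slow $W^{-1/4}$ normalization in the central limit theorem.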


Theorem 4.1.  Under Assumptions (A), (B) and (C) we have for each $t \in I$,

$$E[\hat m_W(t)-m(t)]^2 \le 4\,\omega^2(m;\alpha_W(t)) + U_2\,\nu_W^2(t)$$

where the constant $U_2$ is given by (17c) and $\alpha_W^2(t)$ and $\nu_W^2(t)$ are as in Theorem 3.1.


Proof.  The general bound on the mean-square error is given by (23). We only need to show that m(t) is uniformly continuous on I and to compute $\alpha_W^2(t)$ and $\nu_W^2(t)$ under (C1) and (C2). Since $m(t) = \mu(s(t))$ and $\mu(s)$ is continuously differentiable with bounded derivative over the range of s(t) (cf. (17b)), the uniform continuity of m(t) follows from that of s(t). In fact it is easily seen that

$$\omega(m;\delta) \le Q\,\omega(s;\delta) \tag{25}$$

where the constant Q is finite by (17b). Next we compute $\alpha_W^2(t)$ and $\nu_W^2(t)$ under (C1) and (C2). For the generalized Szasz operators, (C1), we have by (20) and (13)

$$\alpha_W^2(t) = \frac{e^{-Wt}}{A(1)W^2} \sum_{k=0}^{\infty} (k-Wt)^2\,p_k(Wt) = \frac{t}{W} + \frac{A'(1)+A''(1)}{W^2 A(1)}$$

where the last step follows by the expression for the series given in [6]. The expression for $\nu_W^2(t)$ under (C1) follows by Proposition 4.1(b.ii). For the Bernstein operator, (C2), $\alpha_W^2(t)$ is equal to the variance t(1-t)/W of k/W when k has the binomial distribution (W,t), and $\nu_W^2(t)$ is given by Proposition 4.1(c).  □
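The second-moment quantity $\alpha_W^2(t)$ of (20) is easy to verify by direct summation. The sketch below assumes the plain Szasz case A(z) = 1 (Poisson weights, for which the value reduces to t/W) and the binomial Bernstein weights (value t(1-t)/W); the helper names are illustrative.

```python
import math

def szasz_weights(W: float, t: float, kmax: int):
    """Poisson(W t) probabilities for k = 0..kmax (Szasz weights with A(z) = 1)."""
    lam = W * t
    return [math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1)) for k in range(kmax + 1)]

def bernstein_weights(W: int, t: float):
    """Binomial(W, t) probabilities, computed in log space (0 < t < 1)."""
    log_t, log_1mt = math.log(t), math.log(1.0 - t)
    return [
        math.exp(math.lgamma(W + 1) - math.lgamma(k + 1) - math.lgamma(W - k + 1)
                 + k * log_t + (W - k) * log_1mt)
        for k in range(W + 1)
    ]

def alpha_squared(weights, W: float, t: float) -> float:
    """alpha_W^2(t) = sum over k of (k/W - t)^2 h_W(t, k), as in (20)."""
    return sum((k / W - t) ** 2 * h for k, h in enumerate(weights))

W, t = 50, 0.4
print(alpha_squared(szasz_weights(W, t, kmax=200), W, t), t / W)
print(alpha_squared(bernstein_weights(W, t), W, t), t * (1.0 - t) / W)
```

Both quantities are of order 1/W, so $\alpha_W(t)$ itself decays like $W^{-1/2}$ and the bias bound (19) is governed by the modulus of continuity at that scale.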

Next we consider the convergence of $\hat m_W(t)$ in the $2\ell$-th mean. The following proposition on the cumulants of $\hat m_W(t)$ is needed and its proof is given in the Appendix.

Proposition 4.2.  Let Assumptions (A), (B), and (C) be satisfied. Let r be a positive integer and under (B1) or (B2) assume, in addition, that $f(x) \in L_{2r}[\varphi(x;\sigma)dx]$. Then for every choice of points $\{t_1,\ldots,t_r\}$ in I, the joint cumulant of $\hat m_W(t)$ of order r satisfies

(a)
$$|\mathrm{Cum}_r\{\hat m_W(t_1),\ldots,\hat m_W(t_r)\}| \le M_r \sum_k \prod_{i=1}^{r} h_W(t_i,k)$$

for some finite positive constant $M_r$.

(b)
$$|\mathrm{Cum}_r\{\hat m_W(t_1),\ldots,\hat m_W(t_r)\}| \le \frac{M_r[1+o(1)]}{(2\pi W)^{(r-1)/2} \prod_{i=1}^{r} D(t_i)^{(r-1)/2r}}$$

where

$$D(t) = \begin{cases} t, & \text{under (C1)} \\ t(1-t), & \text{under (C2).} \end{cases} \tag{26}$$

Theorem 4.2.  Under the Assumptions of Theorem 3.2 we have

$$E[\hat m_W(t)-m(t)]^{2\ell} \le F_{\ell,\gamma}(t)\,W^{-\ell\min(\gamma,1/2)}\,[1+o(1)]$$

for some continuous function $F_{\ell,\gamma}(t)$ specified in the proof.

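Proof.  For notational convenience we write $\hat m$, m for $\hat m_W(t)$, m(t), respectively. Since $\hat m - m = \mathrm{Bias}[\hat m] + (\hat m - E[\hat m])$ and $E[\hat m - E[\hat m]] = 0$, we have

$$E[\hat m-m]^{2\ell} = (\mathrm{Bias}[\hat m])^{2\ell} + \sum_{j=2}^{2\ell} \binom{2\ell}{j} (\mathrm{Bias}[\hat m])^{2\ell-j}\,E[\hat m-E[\hat m]]^{j}. \tag{27}$$

Since an estimate for $\mathrm{Bias}[\hat m]$ has already been obtained in (19), we seek an estimate for $E[\hat m-E[\hat m]]^{j}$. We recall that with $\eta = \hat m - E[\hat m]$

$$E[\eta^r] = \sum_{p=1}^{r} \sum \prod_{i=1}^{p} \mathrm{Cum}_{\nu_i}\{\eta,\ldots,\eta\}, \qquad r \ge 2, \tag{28}$$

where the inner sum extends over all partitions $(\nu_1,\ldots,\nu_p)$ of the set $\{1,\ldots,r\}$ satisfying $\nu_1+\cdots+\nu_p = r$ [10]. Now any partition $(\nu_1,\ldots,\nu_p)$ with p > [r/2] (the integer part of r/2) will necessarily have a factor $\mathrm{Cum}_1\{\eta\} = E[\eta] = 0$ in the product of cumulants in (28). Thus the range of p in (28) is reduced to p = 1,...,[r/2]. Next we note that $\mathrm{Cum}_\nu\{\eta,\ldots,\eta\} = \mathrm{Cum}_\nu\{\hat m,\ldots,\hat m\}$ for $\nu \ge 2$ and by Proposition 4.2(b) we have

$$|\mathrm{Cum}_\nu\{\eta,\ldots,\eta\}| \le \frac{M_\nu[1+o(1)]}{\{2\pi W D(t)\}^{(\nu-1)/2}}, \qquad \nu \ge 2.$$

Thus for each p = 1,...,[r/2] we have

$$\Big|\sum \prod_{i=1}^{p} \mathrm{Cum}_{\nu_i}\{\eta,\ldots,\eta\}\Big| \le \frac{H_p[1+o(1)]}{\{2\pi W D(t)\}^{(r-p)/2}} \tag{29}$$

with $H_p = \sum \prod_{i=1}^{p} M_{\nu_i}$. (29) implies that the dominant term in (28) as $W \to \infty$ corresponds to p = [r/2], so that

$$|E[\eta^r]| \le \frac{H_{[r/2]}[1+o(1)]}{\{2\pi W D(t)\}^{(r-[r/2])/2}}, \qquad r \ge 2. \tag{30}$$

Since s(t) is Lip $\gamma$, $0 < \gamma \le 1$, i.e., $\omega(s;\delta) \le L_s\delta^{\gamma}$, then by (25) m(t) is also Lip $\gamma$ with

$$\omega(m;\delta) \le L_m\delta^{\gamma}; \qquad L_m = L_s Q.$$

Thus from (19) and $\alpha_W^2(t) = (D(t)/W)[1+o(1)]$ (cf. expressions in Theorem 3.1) we have

$$|\mathrm{Bias}[\hat m]| \le 2L_m (D(t)/W)^{\gamma/2}[1+o(1)]. \tag{31}$$

It then follows by (30) and (31) that (27) can be bounded by

$$E[\hat m-m]^{2\ell} \le (2L_m)^{2\ell}(D(t)/W)^{\gamma\ell}[1+o(1)] + \sum_{j=2}^{2\ell} \binom{2\ell}{j} \frac{(2L_m)^{2\ell-j}\,D(t)^{\gamma(2\ell-j)/2}\,H_{[j/2]}}{\{2\pi D(t)\}^{(j-[j/2])/2}}\,W^{-\theta_j}[1+o(1)] \tag{32}$$

where

$$\theta_j = \ell\gamma - \frac{j}{2}(\gamma-1) - \frac{1}{2}\Big[\frac{j}{2}\Big], \qquad j = 2,3,\ldots,2\ell.$$

We now seek the dominant term in the above bound as $W \to \infty$. This depends on the value of $\gamma$.

(a) For $1/2 < \gamma \le 1$, the dominant term in (32) corresponds to $j = 2\ell$, for which $\theta_{2\ell} = \ell/2$, and thus

$$E[\hat m-m]^{2\ell} \le \frac{H_\ell[1+o(1)]}{\{2\pi D(t)\}^{\ell/2}\,W^{\ell/2}}.$$

(b) For $0 < \gamma < 1/2$, the sum is $o(W^{-\gamma\ell})$ so that

$$E[\hat m-m]^{2\ell} \le \frac{(2L_m)^{2\ell}\{D(t)\}^{\gamma\ell}}{W^{\gamma\ell}}[1+o(1)].$$

(c) For $\gamma = 1/2$, the terms in (32) corresponding to j odd are $o(W^{-\ell/2})$ and are negligible relative to the remaining terms. Thus, with the convention $H_0 = 1$,

$$E[\hat m-m]^{2\ell} \le \frac{1}{W^{\ell/2}} \sum_{\substack{j=0 \\ j\ \mathrm{even}}}^{2\ell} \binom{2\ell}{j} \frac{(2L_m)^{2\ell-j}\{D(t)\}^{(2\ell-j)/4}\,H_{j/2}}{\{2\pi D(t)\}^{j/4}}\,[1+o(1)].$$

These results can be combined for all $0 < \gamma \le 1$ in the form given in the theorem, where $F_{\ell,\gamma}(t)$ can easily be identified from the above analysis.  □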
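The case analysis on $\gamma$ amounts to finding the smallest power of 1/W among the terms of the bound. The sketch below tabulates the exponent $\theta_j = \ell\gamma - (j/2)(\gamma-1) - [j/2]/2$ in exact rational arithmetic, together with the exponent $\gamma\ell$ of the pure bias term, and confirms that the minimum is $\ell\min(\gamma,1/2)$. The function names are illustrative.

```python
from fractions import Fraction

def theta(j: int, ell: int, gamma: Fraction) -> Fraction:
    """Exponent of 1/W for the j-th term: ell*gamma - (j/2)(gamma - 1) - [j/2]/2."""
    return gamma * ell - Fraction(j, 2) * (gamma - 1) - Fraction(j // 2, 2)

def dominant_exponent(ell: int, gamma: Fraction) -> Fraction:
    """Smallest exponent among the bias term (gamma*ell) and the terms j = 2..2*ell."""
    candidates = [gamma * ell] + [theta(j, ell, gamma) for j in range(2, 2 * ell + 1)]
    return min(candidates)

for gamma in (Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    for ell in (1, 2, 3):
        print(gamma, ell, dominant_exponent(ell, gamma), ell * min(gamma, Fraction(1, 2)))
```

For $\gamma > 1/2$ the variance-type term j = 2l dominates, for $\gamma < 1/2$ the bias term dominates, and at $\gamma = 1/2$ all even-j terms share the same exponent.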


We next obtain the strong consistency of $\hat m_W(t)$. The result is identical to Theorem 3.3 but with $\hat m_W(t)-m(t)$ replacing $\hat s_W(t)-s(t)$.

Theorem 4.3.  Under the Assumptions of Theorem 3.3, $\hat m_W(t)-m(t)$ satisfies its conclusion.


Proof.  Fix t in the interior of I and consider the estimate $\hat m_W(t)$ as a function of W. Define a process $\{\eta_u,\ u \ge 0\}$ by

$$\eta_u = \begin{cases} m(t), & u = 0 \\ \hat m_{1/u}(t), & u \ne 0. \end{cases} \tag{33}$$

(a) Under (C1), $\{\eta_u,\ u \ge 0\}$ is not necessarily separable. Fix a sequence $\{W_n\}_{n=1}^{\infty}$ with $W_n \to \infty$ and let $\{\tilde\eta_u,\ u \ge 0\}$ be a separable version of $\{\eta_u,\ u \ge 0\}$ with a separating set which includes the points $u_0 = 0$ and $u_n = 1/W_n$, $n \ge 1$. Then for any $\delta > 0$ we have

$$\sup_{u_n \le \delta} |\eta_{u_n} - \eta_0| \le \sup_{u \le \delta} |\tilde\eta_u - \tilde\eta_0|. \tag{34}$$

Now, since the two processes $\{\eta_u,\ u \ge 0\}$ and $\{\tilde\eta_u,\ u \ge 0\}$ have the same finite dimensional distributions, it follows by Theorem 4.2 that

$$E|\tilde\eta_u - \tilde\eta_0|^{2\ell} \le \{F_{\ell,\gamma}(t)[1+o(1)]\}\,u^{\beta+1},$$

where $\beta = \ell\min(\gamma,1/2) - 1$. It then follows by Kolmogorov's theorem (see Neveu [11, p. 97]) that with probability one

$$\frac{1}{\delta^{a}} \sup_{u \le \delta} |\tilde\eta_u - \tilde\eta_0| \to 0 \quad \text{as } \delta \downarrow 0$$

for any $0 < a < \beta/2\ell$. Hence by (33) and (34) we have, with probability one,

$$\frac{1}{\delta^{a}} \sup_{1/W_n \le \delta} |\hat m_{W_n}(t) - m(t)| \to 0 \quad \text{as } \delta \downarrow 0,$$

and the result follows by choosing $\delta = 1/W_N$.

(b) Under (C2), W = n (an integer) so that $\{\eta_u,\ u \ge 0\}$ is separable. Theorem 4.2 and Kolmogorov's theorem imply that, with probability one,

$$\frac{1}{\delta^{a}} \sup_{1/n \le \delta} |\eta_{1/n} - \eta_0| \to 0 \quad \text{as } \delta \downarrow 0$$

and the result follows by choosing $\delta = 1/N$.  □

We finally derive a central limit theorem for the estimate $\hat m_W(t)$. Define the normalized error process

$$\tilde m_W(t) = \frac{\hat m_W(t)-m(t)}{\mathrm{Var}^{1/2}[\hat m_W(t)]}, \qquad t \in I.$$

Theorem 4.4.  Under the assumptions of Theorem 3.4 we have

(a) For each fixed t in the interior of I, $\tilde m_W(t)$ is asymptotically a standard normal variable as $W \to \infty$. The normalizing factor $\mathrm{Var}^{-1/2}[\hat m_W(t)]$ satisfies

$$\{2\sqrt{\pi D(t)}/U_2\}^{1/2}\,W^{1/4}[1+o(1)] \le \mathrm{Var}^{-1/2}[\hat m_W(t)] \le \{2\sqrt{\pi D(t)}/V_2\}^{1/2}\,W^{1/4}[1+o(1)] \tag{35}$$

where the constants $U_2$ and $V_2$ are given in (17) and D(t) is given by (26).

(b) For the Szasz operator in (C1) (A(z) = 1), we have in addition that the values of the process $\{\tilde m_W(t),\ t > 0\}$ at distinct t's are asymptotically independent as $W \to \infty$.


Proof.  Putting

$$\zeta_W(t) = \frac{\hat m_W(t) - E[\hat m_W(t)]}{\mathrm{Var}^{1/2}[\hat m_W(t)]}$$

we have that

$$\tilde m_W(t) = \zeta_W(t) + \frac{\mathrm{Bias}[\hat m_W(t)]}{\mathrm{Var}^{1/2}[\hat m_W(t)]}.$$

The proof is accomplished by showing that as $W \to \infty$ the second term goes to zero and $\zeta_W(t)$ has the asymptotic properties stated in the theorem.

Under (B1), (B2) and the modified (B3), we have by (17c)-(17d)

$$0 < V_2 \le \mathrm{Var}[Z_{W,k}] \le U_2 < \infty, \tag{36}$$

and thus by (21),

$$V_2\,\nu_W^2(t) \le \mathrm{Var}[\hat m_W(t)] \le U_2\,\nu_W^2(t) \tag{37}$$

(with equality when s(t) is constant). Using the asymptotic expression for $\nu_W^2(t)$ given in Proposition 4.1(b)(c) we have

$$\frac{V_2[1+o(1)]}{2\sqrt{\pi D(t) W}} \le \mathrm{Var}[\hat m_W(t)] \le \frac{U_2[1+o(1)]}{2\sqrt{\pi D(t) W}}. \tag{38}$$

Hence by (31) and (38), since $\gamma > 1/2$,

$$\frac{|\mathrm{Bias}[\hat m_W(t)]|}{\mathrm{Var}^{1/2}[\hat m_W(t)]} = O\!\left(W^{-(\gamma-1/2)/2}\right) \to 0.$$


We now establish the desired asymptotic results for $\zeta_W(t)$ for t in the interior of I. It is clear that

$$E[\zeta_W(t)] = 0, \qquad \mathrm{Var}[\zeta_W(t)] = 1.$$

For Part (a) we show that for each fixed t, all cumulants of $\zeta_W(t)$ of order $r \ge 3$ tend to zero as $W \to \infty$; the asymptotic normality of $\zeta_W(t)$ follows then from Lemma P4.5 of [12]. For Part (b) we show that for all $r \ge 3$ and all instants $t_1,\ldots,t_r$, not necessarily distinct, the joint cumulant

$$\mathrm{Cum}_r\{\zeta_W(t_1),\ldots,\zeta_W(t_r)\} \to 0 \quad \text{as } W \to \infty, \tag{39}$$

and in addition

$$E[\zeta_W(t_1)\zeta_W(t_2)] \to 0 \quad \text{as } W \to \infty \ \text{for } t_1 \ne t_2. \tag{40}$$

It will then follow by the same Lemma of [12] that all finite dimensional distributions of the process $\{\zeta_W(t),\ t > 0\}$ converge to the finite dimensional distributions of a Gaussian process with mean zero and covariance $R(t_1,t_2) = 1$ for $t_1 = t_2$, and $R(t_1,t_2) = 0$ for $t_1 \ne t_2$, i.e., with independent values at distinct points. Both goals will be achieved if we show (39) in general, and (40) in the Szasz case, which we now proceed to do. For $r \ge 3$ and $\{t_i\}$ in the interior of I, we have

$$\mathrm{Cum}_r\{\zeta_W(t_1),\ldots,\zeta_W(t_r)\} = \frac{\mathrm{Cum}_r\{\hat m_W(t_1),\ldots,\hat m_W(t_r)\}}{\prod_{i=1}^{r} \mathrm{Var}^{1/2}[\hat m_W(t_i)]}$$

and using the upper bound in Proposition 4.2(b) for the numerator and the lower bound in (38) for each factor in the denominator, we obtain

$$|\mathrm{Cum}_r\{\zeta_W(t_1),\ldots,\zeta_W(t_r)\}| = O\!\left(W^{-(r-2)/4}\right) \to 0$$

since $r \ge 3$. Next we prove (40) for the special case of the Szasz operator in (C1).

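Note that by Proposition 4.1(b), specialized for the Szasz case, we have

$$\sum_{k=0}^{\infty} h_W(t_1,k)\,h_W(t_2,k) = e^{-W(t_1+t_2)}\,I_0(2W\sqrt{t_1 t_2}). \tag{41}$$

Now

$$\mathrm{Cov}\{\hat m_W(t_1),\hat m_W(t_2)\} = \sum_{k=0}^{\infty} \mathrm{Var}[Z_{W,k}]\,h_W(t_1,k)\,h_W(t_2,k)$$

and by (36) and (41)

$$|\mathrm{Cov}\{\hat m_W(t_1),\hat m_W(t_2)\}| \le U_2\,e^{-W(t_1+t_2)}\,I_0(2W\sqrt{t_1 t_2}).$$

By (37) and (41) we have $\mathrm{Var}[\hat m_W(t)] \ge V_2\,e^{-2Wt} I_0(2Wt)$, so that

$$|E[\zeta_W(t_1)\zeta_W(t_2)]| \le \frac{U_2}{V_2}\,\frac{I_0(2W\sqrt{t_1 t_2})}{\{I_0(2Wt_1)\,I_0(2Wt_2)\}^{1/2}}.$$

Using the asymptotic expansion [13, p. 86] for large x, $I_0(x) = (2\pi x)^{-1/2} e^{x}(1+O(1/x))$, we obtain for $t_1 \ne t_2$ as $W \to \infty$,

$$|E[\zeta_W(t_1)\zeta_W(t_2)]| \le \frac{U_2}{V_2}\,\exp\{-W(\sqrt{t_1}-\sqrt{t_2})^2\}\,[1+o(1)] \to 0.$$

Finally, the bounds on $\mathrm{Var}^{-1/2}[\hat m_W(t)]$ follow from (37) and Proposition 4.1.  □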
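The exponential decay of the correlation between distinct points can be seen directly from the weights. The sketch below assumes the plain Szasz case A(z) = 1 (Poisson weights) and compares the normalized cross sum of the weights with $\exp\{-W(\sqrt{t_1}-\sqrt{t_2})^2\}$; the factor $U_2/V_2$ of the bound is omitted, and the sample points are illustrative.

```python
import math

def szasz_weights(W: float, t: float, kmax: int):
    """Poisson(W t) probabilities for k = 0..kmax (Szasz weights with A(z) = 1)."""
    lam = W * t
    return [math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1)) for k in range(kmax + 1)]

def weight_correlation(W: float, t1: float, t2: float, kmax: int = 400) -> float:
    """sum_k h(t1,k) h(t2,k) divided by sqrt(sum_k h(t1,k)^2 * sum_k h(t2,k)^2)."""
    h1 = szasz_weights(W, t1, kmax)
    h2 = szasz_weights(W, t2, kmax)
    cross = sum(a * b for a, b in zip(h1, h2))
    norm = math.sqrt(sum(a * a for a in h1) * sum(b * b for b in h2))
    return cross / norm

def predicted_decay(W: float, t1: float, t2: float) -> float:
    """Leading-order value exp(-W (sqrt(t1) - sqrt(t2))^2) from the Bessel asymptotics."""
    return math.exp(-W * (math.sqrt(t1) - math.sqrt(t2)) ** 2)

for W in (50, 100):
    print(W, weight_correlation(W, 0.5, 0.8), predicted_decay(W, 0.5, 0.8))
```

Doubling W squares the decay factor, so estimates at points separated by a fixed amount decorrelate geometrically fast in W.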




V.  PROOFS  OF  THEOREMS  OF  SECTIONS  II  AND  III 
Using the convergence results for $\hat m_W(t)$, proven in Section IV(b), and the relationships $\hat s_W(t) = g[\hat m_W(t)]$, $m(t) = \mu[s(t)]$, we now establish the convergence results for $\hat s_W(t)$ stated in Sections II and III. The basic link between the properties of $\hat s_W(t)$ and $\hat m_W(t)$ is provided by the following proposition whose proof is given in the Appendix.

Proposition 5.1.  Let Assumptions (A) and (C) be satisfied. Then

(a) under (B1), with $p \ge 1$, we have

$$E|\hat s_W(t)-s(t)|^p \le [(1/q)^p + (b/\Delta)^p]\,E|\hat m_W(t)-m(t)|^p,$$

(b) under (B2) we have

$$|\hat s_W(t)-s(t)| \le (1/q)\,|\hat m_W(t)-m(t)|,$$

(c) under (B3) we have

$$\hat s_W(t)-s(t) = b\,[\hat m_W(t)-m(t)],$$

where the constants q and $\Delta$ are defined in (17) and b is the upper bound for |s(t)|.
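Part (a) relies on the inverse map being Lipschitz with constant 1/q on $[\mu(-c),\mu(c)]$ and on setting the estimate to zero outside that interval. The following is a sketch for one concrete case, assuming the hardlimiter with $N(0,\sigma^2)$ noise, so that $\mu(s) = 2\Phi(s/\sigma)-1$ and $q = 2\varphi(c;\sigma)$; the clipped inverse `g` mirrors the case split used in the proof of Part (a) but is an illustrative construction, not a transcription of the paper's estimate (8a).

```python
from statistics import NormalDist

def make_estimator(sigma: float, c: float):
    """Return (mu, g, q) for f = sgn and X ~ N(0, sigma^2), with inversion range [-c, c]."""
    nd = NormalDist(0.0, sigma)

    def mu(s: float) -> float:
        # mu(s) = E[sgn(s + X)] = 2*Phi(s/sigma) - 1
        return 2.0 * nd.cdf(s) - 1.0

    lo, hi = mu(-c), mu(c)

    def g(x: float) -> float:
        # Inverse of mu on [mu(-c), mu(c)]; zero outside, as in the proof's case split.
        if lo <= x <= hi:
            return nd.inv_cdf((x + 1.0) / 2.0)
        return 0.0

    q = 2.0 * nd.pdf(c)   # minimum of mu'(s) = 2*pdf(s) over |s| <= c
    return mu, g, q

mu, g, q = make_estimator(sigma=1.0, c=1.5)
print(g(mu(0.7)))          # recovers s = 0.7 up to rounding
print(g(0.95))             # outside [mu(-c), mu(c)], so the estimate is set to 0
print(abs(g(0.5) - g(0.3)), abs(0.5 - 0.3) / q)   # Lipschitz bound with constant 1/q
```

A larger c widens the range over which the inverse is used but makes q smaller, which loosens the factor 1/q in the error bound.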

Theorems 3.0-3.2 follow immediately from Theorems 4.0-4.2, respectively, and Proposition 5.1. Theorem 3.3 follows from Theorem 3.2 and Kolmogorov's theorem [11, p. 97] in the manner of the proof of Theorem 4.3. The deduction of Theorem 3.4 from Theorem 4.4 is given below. Finally, Theorems 2.1-2.4 follow immediately from Theorems 3.1-3.4, respectively. (In Theorem 2.1, for the estimate (8a) under (B1), the values of the constants $K_1$, $K_2$ are obtained from those of Theorem 3.1 by using the computed values $q = 2\varphi(b+\epsilon;\sigma)$, $Q = \sqrt{2/\pi}/\sigma$, $U_2 = 1$, and the inequality $\Delta \ge q(c-b)$; the use of this inequality results in a simple expression for $K_1$ and $K_2$.)

Proof of Theorem 3.4.  (a) Fix t in the interior of the interval I. By Theorem 4.4(a), the distribution of $[\hat m_W(t)-m(t)]/\mathrm{Var}^{1/2}[\hat m_W(t)]$ converges to the distribution of a standard normal variable, say $\zeta_t$. A result of Mann and Wald [14, p. 226] shows that if g(x) has a continuous first derivative in the neighborhood of m(t), and $g'(m(t)) \ne 0$, then the distribution of $\{g[\hat m_W(t)] - g[m(t)]\}/\mathrm{Var}^{1/2}[\hat m_W(t)]$ converges to the distribution of the normal variable $g'(m(t))\zeta_t$. Since $|s(t)| \le b$, $m(t) = \mu(s(t))$ takes values in the interval $[\mu(-b),\mu(b)]$ for all $t \in I$. Thus under (B1), (B2) or the modified (B3) stated in the theorem, g(x) is continuously differentiable over an interval containing $[\mu(-b),\mu(b)]$ and $g'(x) > 0$ for $\mu(-b) \le x \le \mu(b)$. It follows that the distribution of

$$\frac{\hat s_W(t)-s(t)}{\mathrm{Var}^{1/2}[\hat m_W(t)]} = \frac{g[\hat m_W(t)] - g[m(t)]}{\mathrm{Var}^{1/2}[\hat m_W(t)]}$$

converges to the distribution of the normal variable $g'(m(t))\zeta_t$ and the result follows from $g'(m(t)) = 1/\mu'[s(t)]$.

(b) Let $\{t_i\}_{i=1}^{k}$ be distinct points in $(0,\infty)$. By Theorem 4.4(b), the distribution of $\{\tilde m_W(t_i)\}_{i=1}^{k}$ converges to the distribution of independent standard normal variables, say $\{\zeta_i\}_{i=1}^{k}$, as $W \to \infty$. Again by the result of Mann and Wald [14, p. 226], with $\tilde s_W(t_i) = [\hat s_W(t_i)-s(t_i)]/\mathrm{Var}^{1/2}[\hat m_W(t_i)]$ denoting the normalized errors, the distribution of $\sum_{i=1}^{k} \theta_i\,\tilde s_W(t_i)$ converges to the distribution of the normal variable $\sum_{i=1}^{k} \theta_i\,g'[m(t_i)]\,\zeta_i$ whose mean is zero and variance is $\sum_{i=1}^{k} \theta_i^2\{g'[m(t_i)]\}^2$. Since the $\theta_i$'s are arbitrary, it follows that the variables $\{\tilde s_W(t_i)\}_{i=1}^{k}$ are asymptotically independent normal.  □


APPENDIX 

A.  Proof of Proposition 3.0.  It is clear by (11) that $h_n(t,k) \ge 0$ for all k and $\sum_k h_n(t,k) = 1$. Also, $\zeta_n = \xi_1 + \cdots + \xi_n$ has mean nt and variance $n\,\mathrm{Var}[\xi_1]$. Hence

$$\sum_{k=-\infty}^{\infty} \left(\frac{k}{n}-t\right)^2 h_n(t,k) = \frac{1}{n^2}\mathrm{Var}[\zeta_n] = \frac{1}{n}\mathrm{Var}[\xi_1] \to 0 \quad \text{as } n \to \infty.$$

Thus conditions (i)-(iii) of Theorem 3.0 are satisfied for all $t \in I$. For (iv) we have

$$\sum_{k=-\infty}^{\infty} h_n^2(t,k) = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} |\phi_{\zeta_n}(x)|^2\,dx = \lim_{T\to\infty} \frac{1}{2T}\int_{-T}^{T} |\phi_{\xi}(x)|^{2n}\,dx$$

where $\phi_{\zeta_n}(x)$, $\phi_{\xi}(x)$ is the characteristic function of $\zeta_n$, $\xi_1$, respectively. Since $\xi_1$ is integer valued, $\phi_{\xi}(x)$ is periodic with period $2\pi$. Consider all $t \in I$ for which $\xi_1$ is a nondegenerate random variable. Then $\phi_{\xi}(x)$ has a positive fundamental period which, without loss of generality, can be taken as $2\pi$. Then

$$\sum_{k=-\infty}^{\infty} h_n^2(t,k) = \frac{1}{2\pi}\int_{-\pi}^{\pi} |\phi_{\xi}(x)|^{2n}\,dx$$

and (iv) follows by dominated convergence since $|\phi_{\xi}(x)| < 1$ for $0 < |x| < \pi$.  □
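The Parseval-type identity used for condition (iv) can be checked numerically in the simplest case. The sketch below assumes Bernoulli(t) summands, so that the weights are binomial and $|\phi_\xi(x)|^2 = 1 - 4t(1-t)\sin^2(x/2)$; this special case and the function names are illustrative choices.

```python
import math

def sum_sq_binomial(n: int, t: float) -> float:
    """Sum over k of P(zeta_n = k)^2 for zeta_n ~ Binomial(n, t), 0 < t < 1."""
    log_t, log_1mt = math.log(t), math.log(1.0 - t)
    total = 0.0
    for k in range(n + 1):
        log_p = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                 + k * log_t + (n - k) * log_1mt)
        total += math.exp(2.0 * log_p)
    return total

def parseval_integral(n: int, t: float, grid: int = 20000) -> float:
    """(1/(2 pi)) * integral over [-pi, pi] of |phi(x)|^(2n), by the midpoint rule.

    The integrand is even, so the integral is taken over [0, pi] and doubled.
    """
    a = 4.0 * t * (1.0 - t)
    step = math.pi / grid
    return sum((1.0 - a * math.sin(0.5 * (i + 0.5) * step) ** 2) ** n
               for i in range(grid)) / grid

for n in (10, 100, 1000):
    print(n, sum_sq_binomial(n, 0.3), parseval_integral(n, 0.3))
```

Both columns agree and decrease toward zero as n grows, which is the content of condition (iv) in this special case.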

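B.  Proof of Proposition 4.1.  (a) We have $\psi_k = (1/2\pi)\int_{-\pi}^{\pi} \Psi(x)\exp(-ikx)\,dx$ and thus

$$\sum_{|k|\le N} (\psi_k)^{\ell} = \sum_{|k|\le N} \psi_k\,\frac{1}{(2\pi)^{\ell-1}} \int_{-\pi}^{\pi}\!\cdots\!\int_{-\pi}^{\pi} \prod_{j=1}^{\ell-1} \left[\Psi(x_j)\,e^{-ikx_j}\right] dx_j = \frac{1}{(2\pi)^{\ell-1}} \int_{-\pi}^{\pi}\!\cdots\!\int_{-\pi}^{\pi} \Psi_N\Big(-\sum_{j=1}^{\ell-1} x_j\Big) \prod_{j=1}^{\ell-1} \Psi(x_j)\,dx_j$$

where $\Psi_N(x) = \sum_{|k|\le N} \psi_k e^{ikx}$. Since $\Psi(x)$ is continuously differentiable, $\Psi_N(x) \to \Psi(x)$ uniformly on $[-\pi,\pi]$ and, in fact, $\max_{|x|\le\pi} |\Psi(x)-\Psi_N(x)| \le \mathrm{Const}\,N^{-1/2}$ (see [15, p. 31]). The result follows by applying the dominated convergence theorem.

(b) By choosing $\psi_t(x) = \exp[Wt(e^{ix}-1)]\,A(e^{ix})\,A^{-1}(1)$, we obtain from the generating function A(z) of the Appell polynomials (12) that $\psi_k(t) = h_W(t,k)$. Hence by Parseval's relationship

$$\sum_{k=0}^{\infty} h_W(t_1,k)\,h_W(t_2,k) = \frac{1}{2\pi}\int_{-\pi}^{\pi} \psi_{t_1}(x)\,\overline{\psi_{t_2}(x)}\,dx = \frac{e^{-W(t_1+t_2)}}{A^2(1)} \left\{ \frac{1}{2\pi}\int_{-\pi}^{\pi} \exp[W(t_1 e^{ix}+t_2 e^{-ix})]\,|A(e^{ix})|^2\,dx \right\}.$$

But the Fourier series $A(e^{ix}) = \sum_{k=0}^{\infty} a_k e^{ikx}$ converges boundedly and uniformly on $[-\pi,\pi]$, since A(z) is analytic in |z| < R for some R > 1, so that by interchanging summation and integration (as in Part (a)) the expression in braces becomes

$$\sum_{j,\ell=0}^{\infty} a_j a_\ell\,\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{W(t_1+t_2)\cos x}\,e^{i[(j-\ell)x + W(t_1-t_2)\sin x]}\,dx = \sum_{j,\ell=0}^{\infty} a_j a_\ell \left(\frac{t_2}{t_1}\right)^{(j-\ell)/2} I_{j-\ell}(2W\sqrt{t_1 t_2})$$

by [16, p. 488]. Noting that $I_n(x) = I_{-n}(x)$, and considering the sum for $j \ge \ell$ and $j < \ell$, we obtain Part (b.i). For (b.ii) we have from (b.i) with $t_1 = t_2 = t$ that

$$\sum_{k=0}^{\infty} h_W^2(t,k) = \frac{e^{-2Wt}}{A^2(1)} \sum_{k=0}^{\infty} \varepsilon_k b_k I_k(2Wt) \ge r\,e^{-2Wt} I_0(2Wt). \tag{A1}$$

On the other hand, since $|\psi_t(x)| \le \exp(-Wt)\exp(Wt\cos x)$ we have

$$\sum_{k=0}^{\infty} h_W^2(t,k) = \frac{1}{2\pi}\int_{-\pi}^{\pi} |\psi_t(x)|^2\,dx \le e^{-2Wt} I_0(2Wt),$$

which completes the proof for the upper and lower bounds in (b.ii). In order to obtain the asymptotic result in (b.ii) we note that [13, p. 86] as $x \to \infty$

$$I_n(x) = \frac{e^{x}}{\sqrt{2\pi x}} \left[1 + (4n^2-1)\,O\!\left(\frac{1}{x}\right)\right] \tag{A2}$$

where the term O(1/x) is uniform in n. Hence as $W \to \infty$ we have by (A1)

$$\sum_{k=0}^{\infty} h_W^2(t,k) = \frac{1}{2\sqrt{\pi W t}\,A^2(1)} \sum_{k=0}^{\infty} \varepsilon_k b_k \left[1 + (4k^2-1)\,O\!\left(\frac{1}{W}\right)\right] = \frac{1+o(1)}{2\sqrt{\pi W t}}$$

since $\sum_{k=0}^{\infty} \varepsilon_k b_k = [\sum_{n=0}^{\infty} a_n]^2 = A^2(1)$. The asymptotic result will follow by showing $\sum_{k=0}^{\infty} (4k^2-1)\varepsilon_k b_k < \infty$. Since $A(z) = \sum_{n=0}^{\infty} a_n z^n$ is analytic in |z| < R for some R > 1, there exists a constant 0 < r < 1 such that $a_n \le \mathrm{Const}\ r^n$. This implies that $b_n = \sum_{k=0}^{\infty} a_{n+k} a_k \le \mathrm{Const}\ r^n$. Thus $\sum_{k=0}^{\infty} k^2 b_k < \infty$ and the result follows. For Part (b.iii) we have by Part (a) with $\ell \ge 3$

$$\sum_{k=0}^{\infty} [h_W(t,k)]^{\ell} \le \max_{x} |\psi_t(x)| \left[\frac{1}{2\pi}\int_{-\pi}^{\pi} |\psi_t(x)|\,dx\right]^{\ell-1}$$

and the result follows by using the bound $|\psi_t(x)| \le \exp(-Wt)\exp(Wt\cos x) \le 1$ and (A2):

$$\sum_{k=0}^{\infty} [h_W(t,k)]^{\ell} \le \left[e^{-Wt} I_0(Wt)\right]^{\ell-1} = \frac{1+o(1)}{(2\pi W t)^{(\ell-1)/2}}.$$

(c) By choosing $\psi_t(x) = (t\,e^{ix} + 1 - t)^W$, which is the characteristic function of the binomial distribution, we have $\psi_k(t) = h_W(t,k)$ where $h_W(t,k)$ is given in (6). Parts (c.i), (c.ii) follow in the manner of (b.ii), (b.iii), respectively, using the property

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} [1 - a\sin^2(x/2)]^{W}\,dx = \frac{1+o(1)}{\sqrt{\pi a W}}$$

for $0 < a \le 1$, shown in [4].  □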

C.  Proof of Proposition 4.2.  (a) From (18) we have

$$\mathrm{Cum}_r\{\hat m_W(t_1),\ldots,\hat m_W(t_r)\} = \sum_{k_1}\cdots\sum_{k_r} \mathrm{Cum}_r\{Z_{W,k_1},\ldots,Z_{W,k_r}\} \prod_{i=1}^{r} h_W(t_i,k_i) = \sum_k \mathrm{Cum}_r\{Z_{W,k},\ldots,Z_{W,k}\} \prod_{i=1}^{r} h_W(t_i,k) \tag{A3}$$

where the first equality is justified below and the second equality follows from the independence of $\{Z_{W,k}\}_k$ and the fact that the joint cumulant of independent sets of random variables is zero [12, p. 19]. Part (a) will follow from (A3) provided $|\mathrm{Cum}_r\{Z_{W,k},\ldots,Z_{W,k}\}| \le M_r$ for some finite positive constant $M_r$, which is seen as follows:

$$\mathrm{Cum}_r\{Z_{W,k},\ldots,Z_{W,k}\} = \sum_{p=1}^{r} \sum (-1)^{p-1}(p-1)! \prod_{i=1}^{p} E\left[(Z_{W,k})^{\nu_i}\right] \tag{A4}$$

where the inner sum extends over all partitions $(\nu_1,\ldots,\nu_p)$ of the set $\{1,\ldots,r\}$ satisfying $\nu_1+\cdots+\nu_p = r$ [12, p. 19]. By Assumption (A), for all k

$$E|Z_{W,k}|^{\nu} \le \sup_{|s|\le b} E|f(s+X)|^{\nu} = \mathrm{Const} < \infty, \qquad \nu = 1,\ldots,r \tag{A5}$$

where the last step follows from (15) and $f(x) \in L_{2r}[\varphi(x;\sigma)dx]$ under (B1) and (B2) (under (B3) this is obvious). Putting (A5) in (A4) gives the required bound $M_r$.

The first equality in (A3) is justified as follows. Since cumulants and moments can be expressed in terms of each other (cf. (A4) and (28)), it suffices to justify the exchange of expectation and summation for moments. This will follow by Fubini's theorem provided

$$\sum_{k_1}\cdots\sum_{k_r} E\Big|\prod_{i=1}^{r} Z_{W,k_i}\,h_W(t_i,k_i)\Big| \tag{A6}$$

is finite. But by the multi-dimensional version of Hölder's inequality, (A6) is bounded by

$$\sum_{k_1}\cdots\sum_{k_r} \prod_{i=1}^{r} \left\{E|Z_{W,k_i}\,h_W(t_i,k_i)|^{r}\right\}^{1/r} = \prod_{i=1}^{r} \Big[\sum_{k} \{E|Z_{W,k}|^{r}\}^{1/r}\,h_W(t_i,k)\Big].$$

The latter is finite since $E|Z_{W,k}|^r < \infty$ by (A5) and $\sum_k [h_W(t,k)]^r < \infty$ by Proposition 4.1(b)(c).

(b) By the r-dimensional version of Hölder's inequality for sums we have

$$\sum_k \prod_{i=1}^{r} h_W(t_i,k) \le \prod_{i=1}^{r} \Big\{\sum_k [h_W(t_i,k)]^{r}\Big\}^{1/r}$$

and the result follows by Proposition 4.1(b)(c).  □


D.  Proof of Proposition 5.1

We provide the proofs in reverse order.

(c) Under (B3) we have $m(t) = (1/b)s(t)$ since $|s(t)| \le b$. Also

$$\hat m_W(t) = \sum_k \mathrm{sgn}[s(k/W)+X_k]\,h_W(t,k)$$

satisfies $|\hat m_W(t)| \le \sum_k h_W(t,k) = 1$. Hence g(x), given under (B3), is used only for $|x| \le 1$ and thus $\hat s_W(t) = b\,\hat m_W(t)$.

(b) Under (B2), $\mu^{-1}(x)$ exists for all x, $s(t) = \mu^{-1}[m(t)]$ and $\hat s_W(t) = \mu^{-1}[\hat m_W(t)]$. The result follows from the inequality

$$|\mu^{-1}(x) - \mu^{-1}(y)| \le \frac{|x-y|}{\min_{|s|<\infty} \mu'(s)} = \frac{1}{q}|x-y|$$

which is valid for all $-\infty < x,y < \infty$.

(a) Under (B1), $\mu^{-1}(x)$ exists for all x and $s(t) = \mu^{-1}[m(t)]$. For simplicity we omit W and t in the following. We have by (B1)

$$|\hat s - s| = \begin{cases} |\mu^{-1}(\hat m) - \mu^{-1}(m)|, & \text{if } \mu(-c) \le \hat m \le \mu(c) \\ |s|, & \text{otherwise.} \end{cases}$$

Also for $\mu(-c) \le x,y \le \mu(c)$,

$$|\mu^{-1}(x) - \mu^{-1}(y)| \le \frac{|x-y|}{\min_{|s|\le c} \mu'(s)} = \frac{1}{q}|x-y|$$

and thus

$$E|\hat s-s|^p \le (1/q)^p\,E|\hat m-m|^p + |s|^p\,\Pr\{\hat m \notin [\mu(-c),\mu(c)]\}.$$

Now

$$\Pr\{\hat m \notin [\mu(-c),\mu(c)]\} = 1 - \Pr\{\mu(-c) \le \hat m \le \mu(c)\} = 1 - \Pr\{\mu(-c)-m \le \hat m-m \le \mu(c)-m\}$$
$$\le 1 - \Pr\{|\hat m-m| \le \Delta\} = \Pr\{|\hat m-m| > \Delta\} \le (1/\Delta)^p\,E|\hat m-m|^p$$

where the first inequality above follows from (17e), since $m(t) = \mu[s(t)]$ implies $\mu(-b) \le m \le \mu(b)$, hence $\mu(c)-m \ge \mu(c)-\mu(b) \ge \Delta$ and $\mu(-c)-m \le \mu(-c)-\mu(-b) \le -\Delta$, and the last inequality is Markov's inequality. The result follows since $|s| \le b$.  □
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